Persistent homology is a method for probing topological properties of point clouds and functions. The
method involves tracking the birth and death of topological features as one varies a tuning parameter. 
Features with short lifetimes are informally considered to be "topological noise." In this paper, we bring 
some statistical ideas to persistent homology. In particular, we derive confidence intervals that allow us 
to separate topological signal from topological noise. 



1. Introduction. Topological data analysis (TDA) refers to a collection of methods for finding 
topological structure in data (Carlsson (2009); Edelsbrunner and Harer (2010)). TDA has been used 
in protein analysis, image processing, text analysis, astronomy, chemistry, and computer vision. 

One approach to TDA is persistent homology, a branch of computational topology that can be 
thought of as an extension of clustering; see Chazal et al. (2009, 2011). Homology measures the 
connected components, tunnels, voids, etc., of a set $M$. These features are summarized by the Betti
numbers, denoted by $\beta_0, \beta_1, \beta_2, \ldots$. Consider Figure 1. The set on the left has a single connected
component, so its zeroth order Betti number is $\beta_0 = 1$; all higher order homology groups are trivial,
hence $\beta_j = 0$ for all $j > 0$. The set on the right has a single connected component and a hole. Its
zeroth order Betti number is $\beta_0 = 1$ and its first order Betti number is $\beta_1 = 1$. We define these
terms precisely in Section 2.

We are concerned with the following problem. We want to estimate the homology of a set $M$. We
do not observe $M$ directly; rather, we observe a sample $X_1, \ldots, X_n \in \mathbb{R}^D$ from a distribution $P$
that is concentrated on or near $M$. For example, Figure 2 shows data $S_n = \{X_1, \ldots, X_n\}$ drawn from a
distribution on a circle $M$ in the plane, with noise added. The homology of the data set $S_n$ is not
equal to the homology of $M$; however, the set $S_\varepsilon = \bigcup_{i=1}^{n} B(X_i, \varepsilon)$, where $B(x, \varepsilon)$ denotes a ball of
radius $\varepsilon$ centered at $x$, captures the homology of $M$ for an interval of values $\varepsilon$. Figure 2 shows $S_\varepsilon$
for increasing values of $\varepsilon$.

Fig 1. The set on the left has a single connected component. Its zeroth order Betti number is $\beta_0 = 1$. The set on the
right also has a single connected component; however, in addition, there is a hole. Its zeroth order Betti number is
$\beta_0 = 1$ and its first order Betti number is $\beta_1 = 1$. All other Betti numbers are zero for both sets.

When $\varepsilon$ is small, there are $n$ connected components. As $\varepsilon$ increases, the number of connected
components eventually decreases to one. Similarly, at a certain value of $\varepsilon$, a hole appears and
then, when $\varepsilon$ is too large, the hole disappears. Persistent homology is a method for quantifying the
changing homology of sets by pairing the birth and death of homology generators.
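The behavior of the number of connected components as $\varepsilon$ grows can be checked numerically. The following is a minimal sketch, not the authors' code: the function name `count_components`, the noise level, and the sample size are illustrative assumptions. It uses the fact that two closed balls of radius $\varepsilon$ intersect exactly when their centers are at most $2\varepsilon$ apart, and merges intersecting balls with a union-find structure.

```python
import numpy as np

def count_components(points, eps):
    """Number of connected components of the union of closed balls of radius eps."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    for i in range(n):
        for j in range(i + 1, n):
            # Balls of radius eps around points i and j overlap iff dist <= 2 * eps.
            if dists[i, j] <= 2 * eps:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})

# Hypothetical data: a noisy sample from the unit circle.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, size=200)
sample = np.column_stack([np.cos(theta), np.sin(theta)])
sample = sample + rng.normal(scale=0.05, size=sample.shape)

for eps in (0.001, 0.05, 0.2, 1.0):
    print(eps, count_components(sample, eps))
```

This only tracks $\beta_0$; detecting the hole requires the simplicial machinery of Section 2.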

Computing the homology directly from $S_\varepsilon$ is difficult. Instead, one constructs a set of simplices at
scale $\varepsilon$. For example, Figure 3 shows a sequence of Čech complexes (defined in Section 2), each of which
is a collection of simplices defined by connecting points at scale $\varepsilon$. These complexes are used to
extract the topological features. The nerve theorem (see Section 2) guarantees that $S_\varepsilon$ and the Čech
complex have the same homology.

The left plot in Figure 4 shows a barcode plot, which represents the birth and death of these features
(connected components, holes, etc.) as a function of $\varepsilon$. Each bar can also be represented as
a pair $(x, y)$ where $x$ is the time of birth and $y$ is the time of death. Plotting these points yields
a persistence diagram. Points near the diagonal correspond to short bars and are informally
regarded as noise; see the right plot in Figure 4.

One of the key challenges in persistent homology is to find a way to separate the noise from the 
signal in the persistence diagram. The purpose of this paper is to suggest some statistical methods 
for doing this. 

Goals. There are two main goals in this paper. The first is to derive confidence intervals for certain 
key quantities in persistent homology. The second is to introduce persistent homology to statisti- 
cians. We will focus on simple, synthetic examples in this paper as proof of concept. 

Related Work. Some key references in computational topology are (Bubenik and Kim, 2007; Carls- 
son, 2009; Ghrist, 2008; Carlsson and Zomorodian, 2009; Edelsbrunner and Harer, 2008; Chazal and 
Oudot, 2008; Chazal et al., 2011). An introduction to homology can be found in Hatcher (2002). 
The probabilistic basis for random persistence diagrams is studied in Mileyko et al. (2011) and 
Turner et al. (2012). Other relevant probabilistic results can be found in Penrose (2003); Kahle 
(2009, 2011); Kahle and Meckes (2010). Some statistical results for homology and persistent ho- 
mology include Bubenik et al. (2010); Blumberg et al. (2012); Balakrishnan et al. (2011); Joshi 
et al. (2010); Bendich et al. (2010); Niyogi et al. (2009, 2011); Chazal et al. (2010); Bendich et al. 
(2011). The latter paper considers a challenging example which involves a set of the form in Figure 
5 which we consider later in the paper. Heo et al. (2012) contains a detailed analysis of data on 
a medical procedure for upper jaw expansion that uses persistent homology as part of a nonlinear 
dimension reduction of the data. 

Fig 2. The union of balls $S_\varepsilon = \bigcup_{i=1}^{n} B(X_i, \varepsilon)$ for four different values of $\varepsilon$. For some values of $\varepsilon$, such as the value
depicted in the bottom left figure, the set $S_\varepsilon$ has the same topological structure as the underlying circle. When $\varepsilon$ is
large enough, the set $S_\varepsilon$ is topologically equivalent to a ball.

Fig 3. The Čech complex (a collection of simplices) for increasing values of $\varepsilon$. The topological structure of $S_\varepsilon$ can
be analyzed using the Čech complex. Persistent homology is a method for tracking the topology of $S_\varepsilon$ as a function of
$\varepsilon$.

Fig 4. Left: Barcode plot. Each horizontal bar shows the lifetime of a topological feature as a function of $\varepsilon$ for the
homology groups $H_0$, $H_1$, and $H_2$. Significant features have long horizontal bars. Right: Persistence diagram. The
points in the persistence diagram are in one-to-one correspondence with the bars in the barcode plot. The birth and
death times of each bar become the x- and y-coordinates of the persistence diagram. Significant features are far from
the diagonal.



Fig 5. The Eyeglass Curve. Given data sampled on this curve, it is challenging to decide whether there is one loop 
or two. 



Outline. We define persistent homology formally in Section 2. The statistical model is defined in 
Section 3. Several methods for constructing confidence intervals are presented in Section 4. Section 5 
illustrates the ideas with a few numerical experiments. In Section 6, we discuss a connection between 
persistence diagrams and point processes. Proofs are contained in Section 7. Finally, Section 8 
contains concluding remarks. 

Notation. We write $a_n \preceq b_n$ if there exists $c > 0$ such that $a_n \leq c\, b_n$ for all large $n$. For any $x \in \mathbb{R}^D$
and any $r > 0$, $B(x, r)$ denotes the $D$-dimensional ball of radius $r$ centered at $x$. For any set $A$ we
define the (Euclidean) distance function

(1) $d_A(x) = \inf_{y \in A} \|y - x\|$.

For any set $A$ and any $\varepsilon > 0$, define

(2) $A \oplus \varepsilon = \bigcup_{x \in A} B(x, \varepsilon) = \{x : d_A(x) \leq \varepsilon\}$.

The reach of a set $A$, denoted by reach($A$), is the largest $\varepsilon > 0$ such that each point in $A \oplus \varepsilon$
has a unique projection onto $A$ (Federer (1959)). If $f$ is a real-valued function, we define the upper
level set $\{x : f(x) \geq t\}$, the lower level set $\{x : f(x) \leq t\}$, and the level set $\{x : f(x) = t\}$. In
some places, we use symbols like $c, c_1, C, \ldots$ as generic positive constants. For an event $A$, $\mathbb{P}(A)$
denotes the probability of $A$ on an appropriate probability space.

2. Review of Persistent Homology. Persistent homology describes how the topology of the
lower level sets of a real-valued function $f$ changes with $t$. It summarizes $f$ with a
multiset of points in the plane, each corresponding to the birth and death of a homological feature
that existed for some interval of $t$. Features that persist for a large range of values $t$ will be far from
the diagonal.

One of the main functions we study is the distance function $d_{S_n} : \mathbb{R}^D \to \mathbb{R}$ to a point set
$S_n = \{X_1, \ldots, X_n\}$, as defined in (1). Here, $X_1, \ldots, X_n$ are sampled from a distribution $P$ whose support
is a topological space $M$ embedded in $\mathbb{R}^D$. We use the homology of the lower level sets of $d_{S_n}$ in
order to infer the topology of $M$. We will also study the upper level sets of density functions later
in the paper.

2.1. Homology. Homology is a method for quantifying the topological properties of a set M. Here 
we give a brief introduction; see Hatcher (2002) and Edelsbrunner and Harer (2010) for more details. 
Following a large number of definitions, we give a simple example (Example 2) that illustrates the 
main ideas. 

Informal Description. Given a topological space $M$, the $p$-dimensional homology group $H_p(M)$
is an abelian group generated by the $p$-dimensional holes in $M$ (this will be made more precise
below). The Betti number $\beta_p$ is the rank of $H_p(M)$: the value $\beta_0$ counts the number of connected
components of $M$, the value $\beta_1$ counts the number of one-dimensional holes, or tunnels, and the
value $\beta_2$ denotes the number of two-dimensional holes, or voids. We write $M \cong N$ if the topological
spaces $M$ and $N$ have isomorphic homology groups for all $p$.

Complexes. We compute homology using simplicial homology. A simplicial complex K is a set of 
simplices such that the following two conditions hold: 

1. If $\sigma$ is a simplex in $K$ and $\tau$ is a face of $\sigma$, then $\tau$ is also an element of $K$.

2. The intersection of two simplices in K is either a simplex in K or empty. 

The topological space obtained by taking the union of simplices in K under the subspace topology 
(inherited from the ambient Euclidean space) is denoted by \K\. 

Often, we are given a set of points $S$, each point belonging to $\mathbb{R}^D$, and wish to construct a simplicial
complex from them. A natural way to do this is to choose a radius $\varepsilon$ and to center a ball of radius $\varepsilon$
at each point in $S$. The Čech complex is a simplicial complex with the same homotopy type as this
union of balls. (This means they have similar topological features and, in particular, isomorphic
homology groups.)

The Čech complex, denoted by Čech($S, \varepsilon$), is the set of simplices $\sigma$ with vertices $v_1, \ldots, v_k \in S$ such
that

$\bigcap_{i=1}^{k} B(v_i, \varepsilon) \neq \emptyset$,

where $B(v_i, \varepsilon)$ is the ball of radius $\varepsilon$ centered at $v_i$. As stated above, this simplicial complex shares
the same homotopy type as the union of the balls centered at the points in $S$. We formalize this
statement in the following theorem, which can be found in Edelsbrunner and Harer (2010):

Theorem 1 (Nerve Theorem). The union of balls $S \oplus \varepsilon$ and the Čech complex Čech($S, \varepsilon$) have
the same homotopy type.

It is common to approximate the Čech complex with the Vietoris-Rips complex $V(S, \varepsilon)$, which
consists of simplices with vertices in $S$ and diameter at most $2\varepsilon$. In other words, a simplex $\sigma$
is included in the complex if each pair of vertices in $\sigma$ is at most $2\varepsilon$ apart. This complex can
be computed more quickly than the Čech complex, because only pairwise distances need to be
computed. These complexes satisfy the following sandwich relation:

Čech($S, \varepsilon$) $\subset V(S, \varepsilon) \subset$ Čech($S, \sqrt{2}\,\varepsilon$).
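Since membership in the Vietoris-Rips complex depends only on pairwise distances, it can be enumerated directly for small point sets. The sketch below is illustrative only: the function name `rips_complex` and the brute-force enumeration up to a fixed dimension are assumptions made for the example, not a method taken from the paper, and the enumeration is exponential in the simplex dimension.

```python
from itertools import combinations
import numpy as np

def rips_complex(points, eps, max_dim=2):
    """Simplices of V(S, eps), as tuples of point indices, up to dimension max_dim."""
    n = len(points)
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    simplices = [(i,) for i in range(n)]            # every point is a 0-simplex
    for k in range(2, max_dim + 2):                 # k vertices span a (k-1)-simplex
        for verts in combinations(range(n), k):
            # Include the simplex iff every pair of its vertices is at most 2*eps apart.
            if all(dists[i, j] <= 2 * eps for i, j in combinations(verts, 2)):
                simplices.append(verts)
    return simplices

# Three points forming an equilateral triangle with unit side length.
triangle = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 3 ** 0.5 / 2]])
print(rips_complex(triangle, 0.51))
```

For this triangle the three balls of radius 0.51 have no common point (the circumradius is $1/\sqrt{3} \approx 0.577$), so the Čech complex at that scale contains the three edges but not the filled triangle, whereas the Vietoris-Rips complex already contains it. This is the kind of discrepancy the sandwich relation controls.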

The sequence of Čech complexes obtained by gradually increasing the radius of the balls creates a
sequence of inclusions. In general, a filtration is a sequence of topological spaces related by (forward
or backward) inclusions. In this paper, we will only consider forward inclusions between simplicial
complexes, so the filtrations will be of the form:

(3) $|K_0| \hookrightarrow |K_1| \hookrightarrow \cdots \hookrightarrow |K_n|$.

Homology. Now we give some details about (simplicial) homology using $\mathbb{Z}_2$ coefficients. Let $S$ be
a finite set. The reader might want to think of $S$ as a dense subset of a topological space $M$ of
interest. Let $K$ be a simplicial complex such as the Čech complex defined above. If $\sigma_1, \ldots, \sigma_n$ are
$p$-simplices, we define a $p$-chain to be the formal sum $C = \sum_j a_j \sigma_j$, where $a_j \in \{0, 1\}$ and addition
is taken modulo 2. Thus, a chain can be thought of simply as a list of simplices: $\sigma_j$ is in the chain
$C$ iff $a_j = 1$. The sum of two chains $C + D$ can then be regarded as the symmetric set difference
of the two chains. The set of $p$-chains forms an abelian group $C_p$, with the empty chain as the
identity element 0.

The boundary $\partial_p \sigma$ of a simplex $\sigma = (x_0, x_1, \ldots, x_p)$ is defined to be the sum of its co-dimension
one faces, i.e.,

$\partial_p \sigma = \sum_{i=0}^{p} \sigma_{(-i)}$,

where $\sigma_{(-i)}$ is the simplex obtained by removing the vertex $x_i$ from $\sigma$. The boundary of a chain
$C = \sum_j a_j \sigma_j$ is defined to be $\partial_p C = \sum_j a_j \partial_p \sigma_j$. It is easy to verify that $\partial_p \partial_{p+1} \sigma = 0$.

A chain $C \in C_p$ is called a cycle if $\partial_p C = 0$, that is, if the chain has an empty boundary. The set of
cycles is a group, denoted by $Z_p$. Some elements of $Z_p$ are boundaries of chains in $C_{p+1}$. These are
called boundary cycles and form a subgroup of $Z_p$ denoted by $B_p$. In other words, $Z_p = \ker \partial_p$ and
$B_p = \operatorname{im} \partial_{p+1}$, where ker and im denote kernel and image. Finally, the homology group is defined
to be the quotient group

(4) $H_p = Z_p / B_p$.

Fig 6. The groups $C_p$, $Z_p$ and $B_p$ are related by the boundary maps $\partial_p$. The $p$-th homology group $H_p$ is the quotient
group $Z_p / B_p$.

The interpretation is that $H_p$ is the group generated by equivalence classes of $Z_p$, where two cycles
$\alpha_1$ and $\alpha_2$ in $Z_p$ are considered equivalent if they differ by the addition of boundary cycles; in other
words, if $\alpha_1 = \alpha_2 + \alpha_3$, where $\alpha_3 \in B_p$. We represent the equivalence class containing $\alpha \in Z_p$ as
$[\alpha]$; hence, we can write $[\alpha_1] = [\alpha_2]$. If $[\alpha] = [0]$, then we say that $\alpha$ is a trivial cycle; otherwise,
$\alpha$ is non-trivial. The Betti number $\beta_p$ is the rank of $H_p$; this is the number of independent
non-trivial equivalence classes (generators). The relationships between these groups are summarized in Figure 6. The
homology of $M$, denoted by $\mathcal{H}(M)$, is the collection of these groups $\{H_p\}$.

The following example will clarify the ideas. 

Example 2. Consider the dataset in Figure 7. There are 5 points labeled $a, b, c, d, e$. The first
plot shows a simplicial complex $K$ consisting of the following simplices:

$K = \{(a), (b), (c), (d), (e), (a,b), (b,c), (c,d), (d,e), (e,a), (b,d), (b,c,d)\}$.

The second plot shows a chain of 1-simplices

$\alpha_1 = (a,b) + (b,d) + (d,e) + (e,a)$

and the third shows another chain

$\alpha_2 = (a,b) + (b,c) + (c,d) + (d,e) + (e,a)$.

We claim that both chains are cycles. To see this, apply the boundary operator:

$\partial \alpha_1 = \partial(a,b) + \partial(b,d) + \partial(d,e) + \partial(e,a)$
$= (a) + (b) + (b) + (d) + (d) + (e) + (e) + (a)$
$= 0$,

where we recall that the sum can be interpreted as the symmetric set difference. Similarly, $\partial \alpha_2 = 0$. The fourth
plot shows another chain $\alpha_3 = (b,c) + (c,d) + (b,d)$ and it is easy to see that this is also a cycle.
So $\alpha_1, \alpha_2, \alpha_3 \in Z_1$.

Now $\alpha_3$ is not just a cycle, it is also a boundary cycle, because it is the boundary of the triangle
$(b,c,d)$:

$\partial(b,c,d) = (b,c) + (c,d) + (b,d) = \alpha_3$.

Thus $\alpha_3 \in B_1$.

Next, we claim that $\alpha_1$ and $\alpha_2$ are equivalent cycles. Intuitively, this means that they surround
the same hole (the empty square). Formally, these cycles are equivalent because they differ by a
boundary cycle:

$\alpha_1 = \alpha_2 + \alpha_3$.

The set of all first order cycles is

$Z_1 = (\{\alpha_1, \alpha_2, \alpha_3\}, +_2)$.

The set of boundary cycles is $B_1 = (\{\alpha_3\}, +_2)$. The first homology group is $H_1 = \langle [\alpha_1] \rangle$, where
$[\alpha_1]$ represents "the equivalence class of cycles corresponding to $\alpha_1$." Since $H_1$ is generated by one
cycle, we have that $\beta_1 = 1$.

Fig 7. First: A simplicial complex. Second: a cycle $\alpha_1$ is highlighted. Third: a different cycle $\alpha_2$ that is homologically
equivalent to $\alpha_1$. Fourth: another cycle $\alpha_3$. The cycle $\alpha_3$ is a boundary cycle since it is the boundary of the triangle
$(b,c,d)$. $\alpha_1$ is equivalent to $\alpha_2$ since $\alpha_1 = \alpha_2 + \alpha_3$. Both cycles $\alpha_1$ and $\alpha_2$ surround the same hole (the square formed
by $a, b, d, e$).
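The Betti numbers in Example 2 can be verified mechanically from the ranks of the boundary maps over $\mathbb{Z}_2$, using $\beta_p = \dim \ker \partial_p - \dim \operatorname{im} \partial_{p+1}$. The sketch below is an illustration under stated assumptions: the helper names `rank_gf2` and `betti_numbers` are hypothetical, each row of a boundary matrix is stored as an integer bitmask, and the edge $(e, a)$ is included in the complex because the cycles $\alpha_1$ and $\alpha_2$ use it.

```python
from itertools import combinations

def rank_gf2(rows):
    """Rank over Z_2 of a matrix whose rows are given as integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot                      # lowest set bit of the pivot row
        # Clear that bit from every remaining row (addition modulo 2 is XOR).
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def betti_numbers(simplices):
    """Betti numbers [beta_0, beta_1, ...] of a simplicial complex over Z_2."""
    by_dim = {}
    for s in simplices:
        by_dim.setdefault(len(s) - 1, []).append(tuple(sorted(s)))
    top = max(by_dim)
    ranks = {0: 0, top + 1: 0}                    # boundary maps d_0 and d_{top+1} are zero
    for p in range(1, top + 1):
        index = {face: i for i, face in enumerate(by_dim[p - 1])}
        rows = []
        for s in by_dim[p]:
            mask = 0
            for face in combinations(s, p):       # co-dimension one faces have p vertices
                mask |= 1 << index[face]
            rows.append(mask)
        ranks[p] = rank_gf2(rows)
    # beta_p = (number of p-simplices - rank d_p) - rank d_{p+1}
    return [len(by_dim[p]) - ranks[p] - ranks[p + 1] for p in range(top + 1)]

K = [("a",), ("b",), ("c",), ("d",), ("e",),
     ("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a"), ("b", "d"),
     ("b", "c", "d")]
print(betti_numbers(K))  # [1, 1, 0]
```

The output agrees with the example: one connected component, one hole (the square $a, b, d, e$), and no voids. Removing the triangle $(b, c, d)$ from `K` would raise $\beta_1$ to 2, since $\alpha_3$ would then no longer be a boundary.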



2.2. Persistent Homology. Although persistent homology has its roots in Morse theory, it grew 
into an area of research in the 1990s; see e.g. Edelsbrunner et al. (2002) and Frosini (1992). In 
what follows, we present a high-level overview of persistent homology and we refer the reader 
to Edelsbrunner and Harer (2008) and Edelsbrunner and Harer (2010) for more details. 

Given data points $S_n = \{X_1, \ldots, X_n\}$, we are interested in understanding the homology of the
topological space $M$ from which the data were sampled. If our sample is dense enough and the
topological space has a nice embedding in $\mathbb{R}^D$, then $H_p(M)$ is a subgroup of the $p$-th homology
group of the Čech complex Čech($S_n, \varepsilon$) for an interval of values of $\varepsilon$. Choosing the right $\varepsilon$ is a
difficult task: for small $\varepsilon$ the complex has the homology of $n$ points, and for large $\varepsilon$ it has the
homology of a single point. Using persistent homology, we avoid choosing a single $\varepsilon$ by assigning a value to each
non-trivial topological feature that appears in Čech($S_n, \varepsilon$) as $\varepsilon$ increases from 0 to infinity. This
value, which we call the persistence of the feature, is defined to be the length of the interval of $\varepsilon$ for
which that feature occurs. To look at Čech($S_n, \varepsilon$) for every $\varepsilon$ in $(0, \infty)$ would be infeasible. Hence,
we restrict our attention to equivalence classes of homology groups. Let $r_1, \ldots, r_k$ be the radii such
that Čech($S_n, r_i + \varepsilon$) and Čech($S_n, r_i - \varepsilon$) are not identical for sufficiently small $\varepsilon$. Letting $K_0 = \emptyset$,
$K_{k+1}$ be the complete complex, and $K_i$ = Čech($S_n, (r_i + r_{i-1})/2$), the complexes form a filtration as
given in Equation (3), which induces homomorphisms between the corresponding homology groups:

(5) $H_p(|K_0|) \to H_p(|K_1|) \to \cdots \to H_p(|K_n|)$.

Fig 8. Left: distance function $d_X$ for a set of three points. Right: corresponding persistence diagram. If $X \subset \mathbb{R}^1$, then
the persistence diagram records the death times of the initial components. Since all components are born at $s = 0$, all
points in the persistence diagram appear on the y-axis. The point labeled with $\infty$ represents an essential class, i.e.,
one that is inherited from the domain (in this case, $\mathbb{R}^1$).

Recall the Euclidean distance function given by Equation (1). We notice that $d_X^{-1}(0) = X$. In
addition, we call $X$ nice if there exists a positive $\delta$ small enough such that $d_X^{-1}([0, \delta]) \cong X$. Persistent
homology captures the topological changes of the lower level sets of this distance function. For
convenience, let $X_s = d_X^{-1}([0, s])$. If $s \leq t$, there is a natural inclusion $i_{s,t} : X_s \hookrightarrow X_t$ that induces
a group homomorphism $i^*_{s,t} : H_p(X_s) \to H_p(X_t)$. The ordered set of complexes $X_s$ as $s$ varies from
$-\infty$ to $\infty$ is called the lower level set filtration of $f = d_X$. (In Section 4.4, we use the upper level set
filtration, which is defined analogously.) We say that a homology class $[\alpha]$ represented by a $p$-cycle
$\alpha$ is born at $X_s$ if $\alpha$ is not supported in $X_r$ for any $r < s$ and $[\alpha]$ is nontrivial in $H_p(X_s)$. The class
$[\alpha]$ born at $X_s$ dies going into $X_t$ if $t$ is the smallest index such that the class $[\alpha]$ becomes trivial
(or merges with a class born earlier) in the image of $i^*_{s,t}$.

The birth at $s$ and death at $t$ of $[\alpha]$ is recorded in the $p$-th persistence diagram $\mathcal{P}_p(d_X)$ as the point
$(s, t)$.

Definition 3 (Persistence Diagram). The $p$-th persistence diagram $\mathcal{P}_p(d_X)$ is the multiset of points
$(s, t)$ in the extended plane $\overline{\mathbb{R}}^2$ such that the points in the diagram are in one-to-one correspondence
with the $\beta_p$ generators of $H_p(d_X)$. The persistence barcode is a multiset of intervals that encodes
the same information as the persistence diagram by representing the point $(s, t)$ as the interval
$[s, t]$.

In particular, the 0-dimensional diagram $\mathcal{P}_0(d_X)$ records the birth and death of components of the
lower level sets; more generally, the $p$-dimensional diagram $\mathcal{P}_p(d_X)$ records the $p$-dimensional holes
of the lower level sets. We let $\mathcal{P}(f)$ be the overlay of all persistence diagrams for $f$; see Figure 8
and Figure 9 for examples of persistent homology of one-dimensional and two-dimensional distance
functions.

Fig 9. If $X \subset \mathbb{R}^2$, then the persistence diagram tracks the 0- and 1-dimensional homology classes. In the example
shown, there is a non-trivial 1-cycle born at Xe that dies going into X7. This cycle is denoted with the symbol A in
the persistence diagram.
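For the distance function to a finite point set, the 0-dimensional diagram can be computed without building any complex. Every component is born at level 0, and two components of the lower level set $\{x : d_S(x) \leq t\}$ (a union of balls of radius $t$) merge when $t$ reaches half the distance between their closest points. The following sketch, with the hypothetical function name `zero_dim_diagram`, processes pairwise distances in increasing order with a union-find structure; it is an illustration consistent with the description of Figure 8, not the authors' implementation.

```python
import numpy as np

def zero_dim_diagram(points):
    """(birth, death) pairs of the components of the lower level sets of d_S."""
    n = len(points)
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    edges = sorted((float(dists[i, j]), i, j)
                   for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    diagram = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge when balls of radius length / 2 first touch.
            parent[ri] = rj
            diagram.append((0.0, length / 2))
    # One component never dies: the essential class.
    diagram.append((0.0, float("inf")))
    return diagram

# Three points on the real line, as in the setting of Figure 8.
print(zero_dim_diagram(np.array([[0.0], [1.0], [3.0]])))
# [(0.0, 0.5), (0.0, 1.0), (0.0, inf)]
```

All points lie on the y-axis of the diagram, as the caption of Figure 8 notes, and the finite death times are half the edge lengths of a minimum spanning tree of the point set.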

2.3. Bottleneck Distance $W_\infty$. We now define a metric $W_\infty$, called the bottleneck distance, that
measures the distance between two persistence diagrams. Intuitively, the bottleneck distance between
two persistence diagrams is the maximum distance between points on the first diagram and
points on the second diagram, after matching the points optimally. See Figure 10.

We first describe the general matching problem, where we match the elements of a set $A \subset \mathbb{R}^2$ with
elements of a second set $B \subset \mathbb{R}^2$. A matching between $A$ and $B$ is a set of edges $(a, b)$ with $a$ in
$A$ and $b$ in $B$ such that no vertex is incident to two edges. A matching is maximal if the addition
of any edge would result in a graph that is no longer a matching. A matching is perfect if every
vertex is incident on exactly one edge. In other words, it is a matching where there does not exist
an unmatched vertex.

Letting $f, g : X \to \mathbb{R}$, the goal is to obtain a perfect matching between the points in $\mathcal{P}(f)$ and in
$\mathcal{P}(g)$ that minimizes the cost associated with the matching. To resolve the issue where the number
of off-diagonal points in both diagrams is not equal or the diagrams are dissimilar, we allow an
off-diagonal point to be matched to a point on the line $y = x$. Hence, we let $A$ be the set of
off-diagonal points in $\mathcal{P}(f)$ along with the orthogonal projection of the off-diagonal points in $\mathcal{P}(g)$
onto the line $y = x$. Symmetrically, we let $B$ be the set of off-diagonal points in $\mathcal{P}(g)$ along with the
orthogonal projection of the off-diagonal points in $\mathcal{P}(f)$ onto the line $y = x$. The cost of a matching
is a function of the costs of the edges in the matching. The cost $d_\infty(a, b)$ of an edge between points $a$ and $b$
is the distance

(6) $d_\infty(a, b) = \max\{|a_x - b_x|, |a_y - b_y|\}$,

where $(a_x, a_y)$ and $(b_x, b_y)$ are the coordinates of $a$ and $b$. As we will see next, the bottleneck
distance minimizes the maximum pairwise cost. The bottleneck matching minimizes the bottleneck
cost function, which is defined as follows:

Definition 4 (Bottleneck Cost). The bottleneck cost of a perfect matching $P$ is the maximum
edge cost in the matching:

$C_B(P) = \max_{(a,b) \in P} d_\infty(a, b)$.

Fig 10. Two persistence diagrams are overlaid. The matching between the two diagrams is indicated by adding a
line segment between paired points in the persistence diagrams. In general, a persistence diagram consists of a finite
number of off-diagonal points as well as the diagonal $y = x$. Hence, when we take the pairing, we allow a point to be
paired with a point on the diagonal, as shown above.



We minimize this cost over all perfect matchings of $A$ and $B$ to obtain the bottleneck distance
between the two sets:

(7) $W_\infty(A, B) = \min_P C_B(P)$.

The matching that attains the bottleneck distance is called the bottleneck matching. Thus, if $f$ and
$g$ are two functions and $\mathcal{P}(f)$ and $\mathcal{P}(g)$ are the corresponding persistence diagrams, then we define
the bottleneck distance between the diagrams

(8) $W_\infty(\mathcal{P}(f), \mathcal{P}(g))$

as the above minimum cost of the best matching.
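For small diagrams the bottleneck distance can be computed by a direct search. The sketch below is one possible implementation, not the paper's: it assumes the usual convention that an off-diagonal point $(x, y)$ matched to the diagonal costs $(y - x)/2$ (its $d_\infty$ distance to its orthogonal projection) and that matching two diagonal points costs 0. It then finds the smallest threshold at which a perfect matching exists, using a simple augmenting-path bipartite matching. The function name `bottleneck_distance` is hypothetical, and diagrams are assumed to contain only finite points.

```python
def bottleneck_distance(diag_a, diag_b):
    """Bottleneck distance between two finite lists of (birth, death) pairs."""
    na, nb = len(diag_a), len(diag_b)
    size = na + nb                  # each side: its own points plus diagonal slots
    if size == 0:
        return 0.0

    def cost(i, j):
        a_real, b_real = i < na, j < nb
        if a_real and b_real:       # two off-diagonal points: d_infinity distance
            (ax, ay), (bx, by) = diag_a[i], diag_b[j]
            return max(abs(ax - bx), abs(ay - by))
        if a_real:                  # point of A matched to the diagonal
            return (diag_a[i][1] - diag_a[i][0]) / 2
        if b_real:                  # point of B matched to the diagonal
            return (diag_b[j][1] - diag_b[j][0]) / 2
        return 0.0                  # two diagonal points

    costs = [[cost(i, j) for j in range(size)] for i in range(size)]

    def has_perfect_matching(threshold):
        match = [-1] * size         # match[j] = left vertex matched to right vertex j

        def augment(i, seen):
            for j in range(size):
                if costs[i][j] <= threshold and not seen[j]:
                    seen[j] = True
                    if match[j] == -1 or augment(match[j], seen):
                        match[j] = i
                        return True
            return False

        return all(augment(i, [False] * size) for i in range(size))

    # The optimal bottleneck cost is one of the edge costs; binary search over them.
    candidates = sorted({c for row in costs for c in row})
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if has_perfect_matching(candidates[mid]):
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo]

print(bottleneck_distance([(0.0, 1.0), (0.4, 0.5)], [(0.0, 1.0)]))
```

In this usage example the long bar is matched to its copy at cost 0 and the short bar $(0.4, 0.5)$ is matched to the diagonal, so the distance is about 0.05. Dedicated libraries use faster geometric matching algorithms; this version is meant only to mirror Definition 4 and Equation (7).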



2.4. Stability. We say that the persistence diagram is stable if a small change in the input function
produces a small change in the persistence diagram. There are many variants of the stability for
persistence diagrams, as we may define different ways of measuring distance between functions
or distance between persistence diagrams. We are interested in using the $L_\infty$ distance between
functions and the bottleneck distance between persistence diagrams.

A function is tame if it has finitely many homological critical values and if the homology groups
of each lower level set have finite rank. Let $f, g : X \to \mathbb{R}$ be tame functions with $X$ a triangulable
topological space. The $L_\infty$-distance $\|f - g\|_\infty$ is the maximum difference between the function
values:

$\|f - g\|_\infty = \sup_x |f(x) - g(x)|$.

The $L_\infty$-distance between $f$ and $g$ is an upper bound for the bottleneck distance between the
corresponding persistence diagrams:

Theorem 5 (Bottleneck Stability Theorem, Cohen-Steiner et al. (2005)). The bottleneck distance
between the persistence diagrams of tame functions is bounded from above by the $L_\infty$-distance
between them:

(9) $W_\infty(\mathcal{P}(f), \mathcal{P}(g)) \leq \|f - g\|_\infty$.

We refer the reader to Cohen-Steiner et al. (2006) for a proof of this theorem.



2.5. Hausdorff Distance. Let $A, B$ be compact subsets of $\mathbb{R}^D$. One way to measure the distance
between these sets is to take the Hausdorff distance, denoted by $\mathbf{H}(A, B)$, which is the maximum
Euclidean distance from a point in one set to the closest point in the other set:

$\mathbf{H}(A, B) = \max\left\{ \max_{x \in A} \min_{y \in B} \|x - y\|, \; \max_{x \in B} \min_{y \in A} \|x - y\| \right\}$
$= \inf\{\varepsilon : A \subset B \oplus \varepsilon \text{ and } B \subset A \oplus \varepsilon\}$.

Finally, let us note that, from the above, if $S$ is any subset of $M$, $\mathcal{P}_S$ is the persistence diagram
based on the level sets $\{x : d_S(x) \leq t\}$ and $\mathcal{P}$ is the persistence diagram based on the level sets
$\{x : d_M(x) \leq t\}$, then

(10) $W_\infty(\mathcal{P}_S, \mathcal{P}) \leq \|d_S - d_M\|_\infty = \mathbf{H}(S, M)$.
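For finite point sets the Hausdorff distance follows directly from the max-min formula. The sketch below uses NumPy broadcasting to form all pairwise distances; the function name `hausdorff` and the toy sets are assumptions made for illustration.

```python
import numpy as np

def hausdorff(a, b):
    """Hausdorff distance between finite point sets given as (n, D) and (m, D) arrays."""
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    a_to_b = dists.min(axis=1).max()   # max over x in A of the distance to B
    b_to_a = dists.min(axis=0).max()   # max over x in B of the distance to A
    return max(a_to_b, b_to_a)

A = np.array([[0.0, 0.0], [1.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 3.0]])
print(hausdorff(A, B))  # 3.0

# The sup-norm gap between the two distance functions is attained on A or B:
# at x = (1, 3) we have d_B(x) = 0 and d_A(x) = 3.
d_A_on_B = np.linalg.norm(B[:, None, :] - A[None, :, :], axis=-1).min(axis=1)
print(d_A_on_B.max())   # 3.0
```

Here every point of $A$ belongs to $B$, so only the second term of the maximum is non-zero, and the value 3 matches $\|d_A - d_B\|_\infty$ evaluated at the extra point.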

2.6. Computing and Estimating the Homology of Manifolds. Let $M \subset \mathbb{R}^D$ be a smooth sub-manifold
of $\mathbb{R}^D$ and let $m$ be the homological feature size of $M$, defined as the smallest positive $t$ such
that the inclusion $M \oplus t/2 \hookrightarrow M \oplus t$ does not induce an isomorphism on homology. Let $S_n = \{X_1, \ldots, X_n\}$
be a set of points sampled from $M$ such that $\mathbf{H}(S_n, M) < m/4$. We want to use $S_n$ to compute $\mathcal{H}(M)$,
even though $\mathcal{H}(M)$ cannot be computed directly. We proceed in three steps.

First. We would like to expand $M$ slightly without changing its homology; that is, for $\varepsilon$ small
enough, the following equation should hold:

$M \cong M \oplus \varepsilon$.

Indeed, this equation holds for $M$ a smooth sub-manifold of $\mathbb{R}^D$ (Chazal and Oudot (2008)).

Second. Let $i$ be the inclusion $S_n \oplus m/4 \hookrightarrow S_n \oplus 3m/4$. Then, the image of the induced map on
homology is isomorphic to the homology of $M \oplus \varepsilon$, for $\varepsilon$ small enough. That is, $M \oplus \varepsilon \cong S_n \oplus m/4$
up to features with persistence less than $2m$ (Cohen-Steiner et al. (2005)). This means that we
can compute the persistent homology for $d_{S_n}$ and read off the homology of $M$, and in particular
generators for the homology of $M$ are also generators for the homology of $S_n \oplus m/4$. We will denote
this relationship with $\overset{*}{\cong}$ as follows:

(11) $M \oplus \varepsilon \overset{*}{\cong} S_n \oplus m/4$.

Third. Finally, we need a way to compute $S_n \oplus m/4$, and we use the nerve theorem to obtain:

$S_n \oplus m/4 \cong$ Čech($S_n, m/4$).

We summarize our observations:

(12) $M \cong M \oplus \varepsilon \overset{*}{\cong} S_n \oplus m/4 \cong$ Čech($S_n, m/4$).

3. Statistical Model. Let $X_1, \ldots, X_n \sim P$ where $X_i \in \mathbb{R}^D$. Let $M$ denote the support of $P$. Let
us define the following quantities:

(13) $\rho(x, t) = \frac{P(B(x, t/2))}{t^d}$, $\quad \rho(t) = \inf_{x \in M} \rho(x, t)$.

We assume that $\rho(x, t)$ is a continuous function of $t$ and we define

(14) $\rho(x, \downarrow 0) = \lim_{t \to 0^+} \rho(x, t)$, $\quad \rho = \lim_{t \to 0^+} \rho(t)$.

Note that if $P$ has continuous density $p$ with respect to the uniform measure on $M$, then
$\rho \propto \inf_{x \in M} p(x)$. Until Section 4.4, we make the following assumptions:

Assumption (A1): $M$ is an embedded $d$-dimensional manifold (closed, no boundary) in $\mathbb{R}^D$
and reach($M$) $> 0$.

Assumption (A2): For each $x \in M$, $\rho(x, t)$ is a bounded continuous function of $t$, differentiable
for $t \in (0, t_0)$ and right differentiable at 0. Moreover, $\partial \rho(x, t)/\partial t$ exists and is bounded away from
zero and infinity for $t$ in an open neighborhood of 0. Also,

(15) $\sup_x \sup_{0 \leq t \leq t_0} \left| \frac{\partial \rho(x, t)}{\partial t} \right| \leq C_1 < \infty$, and $\sup_{0 \leq t \leq t_0} |\rho'(t)| \leq C_2 < \infty$,

for some $t_0 > 0$ and some $C_1$ and $C_2$.

Remarks: The reach of $M$ does not appear explicitly in the results as the dependence is implicit
and does not affect the rates in the asymptotics. Note that, if $P$ has a density $p$ with respect to the
Hausdorff measure on $M$, then (A2) is satisfied as long as $p$ is smooth and bounded away from zero.
Assumption (A1) guarantees that, as $\varepsilon \to 0$, the covering number $N(\varepsilon)$ satisfies $N(\varepsilon) \asymp (1/\varepsilon)^d$.
However, the conditions are likely stronger than needed. For example, it suffices that $M$ be compact
and $d$-rectifiable. See, for example, Mattila (1999) and Ambrosio et al. (2000).

Recall that the distance function is $d_M(x) = \inf_{y \in M} \|x - y\|$ and let $\mathcal{P}$ be the persistence diagram
defined by the lower level sets $\{x : d_M(x) \leq \varepsilon\}$. Our target of inference is $\mathcal{P}$. Let $\widehat{\mathcal{P}}$ denote the
persistence diagram of the lower level sets $\{x : d_{S_n}(x) \leq \varepsilon\}$ where $S_n = \{X_1, \ldots, X_n\}$. We regard $\widehat{\mathcal{P}}$ as an estimate
of $\mathcal{P}$. Our main goal is to find a confidence interval for $W_\infty(\widehat{\mathcal{P}}, \mathcal{P})$ as this implies a confidence set
for the persistence diagram.

In Section 4.4, we weaken the assumptions. Specifically, we allow outliers, which means there may
be points not on $M$. Bendich et al. (2011) show that methods based on the Čech complex perform
poorly when there are outliers. We shall see that the methods in Section 4.4 are robust to outliers.


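4. Confidence Intervals. We will find $c_n = c_n(X_1, \ldots, X_n)$ such that

(16) $\limsup_{n \to \infty} \mathbb{P}(W_\infty(\widehat{\mathcal{P}}, \mathcal{P}) > c_n) \leq \alpha$.

It then follows that $C_n = [0, c_n]$ is an asymptotic $1 - \alpha$ confidence set for the bottleneck distance
$W_\infty(\widehat{\mathcal{P}}, \mathcal{P})$, that is,

(17) $\liminf_{n \to \infty} \mathbb{P}(W_\infty(\widehat{\mathcal{P}}, \mathcal{P}) \in [0, c_n]) \geq 1 - \alpha$.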

Fig 11. [Persistence diagram: horizontal axis Birth, vertical axis Death; regions labeled "signal" and "noise".]
First, we obtain the confidence interval $[0, c_n]$ for $W_\infty(\widehat{\mathcal{P}}, \mathcal{P})$. If a box of side length $2c_n$ around a point in the
diagram hits the diagonal, we consider that point to be noise. By putting a strip of width $\sqrt{2}\, c_n$ around the diagonal,
we need only check which points fall inside the strip and outside the strip.



Recall that, from Theorem 5 and the fact that $\|d_M - d_{S_n}\|_\infty = \mathbf{H}(S_n, M)$, we have

(18) $W_\infty(\widehat{\mathcal{P}}, \mathcal{P}) \leq \mathbf{H}(S_n, M)$,

where $S_n = \{X_1, \ldots, X_n\}$ is the sample and $\mathbf{H}$ is the Hausdorff distance. Hence, it suffices to find
$c_n$ such that

(19) $\limsup_{n \to \infty} \mathbb{P}(\mathbf{H}(S_n, M) > c_n) \leq \alpha$.

We interpret $c_n$ in the following way: any point $p$ on the persistence diagram is considered
indistinguishable from noise if the box of side length $2c_n$ around $p$, formally defined as
$\{q \in \mathbb{R}^2 : d_\infty(p, q) \leq c_n\}$, intersects the diagonal. We visualize the confidence set by adding a strip
of width $\sqrt{2}\, c_n$ around the diagonal to the persistence diagram $\widehat{\mathcal{P}}$. The interpretation is this:
points in the strip are not significantly different from noise. Points above the strip are considered
real signal. This leads to a diagram of the form shown in Figure 11. (Alternatively, one can simply
clean the diagram by removing any point $p$ if a box of side length $2c_n$ centered at $p$ intersects the
diagonal.)
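The cleaning rule reduces to a one-line test. A box of side length $2c_n$ centered at $p = (\text{birth}, \text{death})$ meets the diagonal exactly when its lower-right corner $(\text{birth} + c_n, \text{death} - c_n)$ lies on or below the line $y = x$, that is, when $\text{death} - \text{birth} \leq 2c_n$. Equivalently, the perpendicular distance from $p$ to the diagonal is at most $\sqrt{2}\, c_n$. The sketch below applies this test; the function name `split_signal_noise` and the example values are hypothetical.

```python
def split_signal_noise(diagram, c_n):
    """Partition (birth, death) pairs into signal and noise for a given c_n."""
    signal, noise = [], []
    for birth, death in diagram:
        # The box of side 2 * c_n around the point hits the diagonal
        # iff the lifetime death - birth is at most 2 * c_n.
        if death - birth <= 2 * c_n:
            noise.append((birth, death))
        else:
            signal.append((birth, death))
    return signal, noise

diagram = [(0.0, 1.0), (0.1, 0.2), (0.0, float("inf"))]
signal, noise = split_signal_noise(diagram, c_n=0.1)
print(signal)  # [(0.0, 1.0), (0.0, inf)]
print(noise)   # [(0.1, 0.2)]
```

Essential classes, which have infinite death time, are always retained by this rule.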



Remark: This simple dichotomy of "signal" and "noise" is not the only way to quantify the uncer- 
tainty in the persistence diagram. Indeed, some points near the diagonal may represent interesting 
structure. One can imagine endowing each point in the diagram with a confidence set, possibly 
of different sizes and shapes. But for the purposes of this paper, we focus on the simple method 
described above. 



The first three methods that we present are based on the persistence diagram constructed from the
Čech complex. The fourth method takes a different approach completely, and is based on density
estimation. In this section we define the methods; we illustrate them in Section 5.



4.1. Method I: Subsampling. In this section, we use subsampling. The usual approach to subsampling 
(Politis et al. (1999); Romano and Shaikh (2012)) is based on the assumption that we have 
an estimator $\hat\theta$ of a parameter $\theta$ such that $n^{\xi}(\hat\theta - \theta)$ converges in distribution to some fixed 
distribution $J$ for some $\xi > 0$. Unfortunately, our problem is not of this form. Nonetheless, we can still 
use subsampling as long as we are willing to have conservative confidence intervals. 

Let $b = b_n$ be such that $b = o(n)$ and $b_n \to \infty$. We draw $N$ subsamples $\mathcal{S}_b^1, \dots, \mathcal{S}_b^N$, each of size 
$b$, from the data. (In principle, $N = \binom{n}{b}$. In practice, it suffices to draw a large number of random 
subsamples.) Let $T_j = \mathbf{H}(\mathcal{S}_b^j, \mathcal{S}_n)$, $j = 1, \dots, N$. Define 

(20) $L_b(t) = \frac{1}{N} \sum_{j=1}^{N} I(T_j > t)$ 

and let $c_b = 2 L_b^{-1}(\alpha)$. 
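The subsampling recipe can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the default of 200 random subsamples, and the toy circle data are assumptions, and $L_b^{-1}(\alpha)$ is read as the generalized inverse $\inf\{t : L_b(t) \le \alpha\}$. Because each subsample is contained in the full sample, only one of the two directed Hausdorff distances can be non-zero.

```python
import math
import random


def hausdorff_to_subset(sample, subset):
    """Hausdorff distance between a sample and one of its subsets."""
    # subset is contained in sample, so the distance from subset to sample is 0.
    return max(min(math.dist(x, y) for y in subset) for x in sample)


def subsampling_cutoff(sample, b, alpha, n_subsamples=200, seed=0):
    """Return c_b = 2 * inf{t : L_b(t) <= alpha} from random subsamples of size b."""
    rng = random.Random(seed)
    t_values = sorted(
        hausdorff_to_subset(sample, rng.sample(sample, b))
        for _ in range(n_subsamples)
    )
    # L_b is a step function that only changes at the observed T_j, so the
    # infimum is attained at one of them.
    for t in t_values:
        if sum(v > t for v in t_values) / n_subsamples <= alpha:
            return 2.0 * t


# Hypothetical data: n equally spaced points on the unit circle.
n = 150
circle = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
          for k in range(n)]
c_b = subsampling_cutoff(circle, b=40, alpha=0.05, n_subsamples=100)
```

The quadratic-time distance computation is chosen for readability; for large samples a spatial index would be the natural replacement.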

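Theorem 6. Assume that $\rho > 0$. Then, almost surely, for all large $n$, 

(21) $\mathbb{P}(W_\infty(\hat{\mathcal{P}}, \mathcal{P}) > c_b) \le \mathbb{P}(\mathbf{H}(\mathcal{S}_n, \mathbb{M}) > c_b) \le \alpha + O\left(\sqrt{\frac{b \log n}{n}}\right) + \frac{2^d}{n \log n}$. 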



4.2. Method II: Concentration of Measure. The following Lemma is similar to theorems in Devroye 
and Wise (1980), Cuevas et al. (2001) and Niyogi et al. (2009). 



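Lemma 7. For all $t > 0$, 

(22) $\mathbb{P}(\mathbf{H}(\mathcal{S}_n, \mathbb{M}) > t) \le \frac{2^d}{\rho(t/2)\, t^d} \exp\left(-n \rho(t) t^d\right)$. 

If, in addition, $t \le \min\{\rho/(2C_2), t_0\}$, then 

(23) $\mathbb{P}(W_\infty(\hat{\mathcal{P}}, \mathcal{P}) > t) \le \mathbb{P}(\mathbf{H}(\mathcal{S}_n, \mathbb{M}) > t) \le \frac{2^{d+1}}{\rho t^d} \exp\left(-\frac{\rho}{2} n t^d\right)$. 

Hence, if $t_n(\alpha) \le \min\{\rho/(2C_2), t_0\}$ is the solution to the equation 

(24) $\frac{2^{d+1}}{\rho t^d} \exp\left(-\frac{\rho}{2} n t^d\right) = \alpha$, 

then 

$\mathbb{P}(\mathbf{H}(\mathcal{S}_n, \mathbb{M}) > t_n(\alpha)) \le \alpha$. 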



Remarks. From the previous Lemma, it follows that, setting $t_n = \left(\frac{4 \log n}{\rho n}\right)^{1/d}$, 

$\mathbb{P}(\mathbf{H}(\mathcal{S}_n, \mathbb{M}) > t_n) \le \frac{2^d}{n \log n}$, 

for all $n$ large enough. The right hand side of (22) is known as the Lambert function (Lambert, 
1758). Equation (24) does not admit a closed form solution, but can be solved numerically. 
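As a sketch of the numerical solution, the bound $\frac{2^{d+1}}{\rho t^d}\exp(-\rho n t^d/2)$ is strictly decreasing in $t$, so bisection on its logarithm finds the $t$ at which it equals $\alpha$. The function names and the parameter values below are hypothetical, and the code does not check the side condition $t \le \min\{\rho/(2C_2), t_0\}$. With the substitution $u = \rho n t^d/2$ the equation becomes $u e^u = 2^d n/\alpha$, which gives a convenient check on the result.

```python
import math


def log_tail_bound(t, n, d, rho):
    """Logarithm of (2^(d+1) / (rho * t^d)) * exp(-rho * n * t^d / 2)."""
    return ((d + 1) * math.log(2.0) - math.log(rho)
            - d * math.log(t) - 0.5 * rho * n * t ** d)


def solve_cutoff(n, d, rho, alpha, iterations=200):
    """Bisection for the t at which the bound equals alpha."""
    target = math.log(alpha)
    lo, hi = 0.0, 1.0
    # The bound decreases in t, so enlarge hi until the bound is below alpha.
    while log_tail_bound(hi, n, d, rho) > target:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if log_tail_bound(mid, n, d, rho) > target:
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical inputs, for illustration only.
t_alpha = solve_cutoff(n=500, d=1, rho=0.3, alpha=0.05)
```

Working on the log scale avoids overflow when $t$ is small and the polynomial factor is very large.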
To use the lemma, we need to estimate $\rho$. Let $P_n$ be the empirical measure induced by the sample 
$\mathcal{S}_n$, given by 

$P_n(A) = \frac{1}{n} \sum_{i=1}^{n} I_A(X_i)$ 

for any measurable Borel set $A \subset \mathbb{R}^D$. Let $r_n$ be a positive small number and consider the plug-in 
estimator of $\rho$ 

(25) $\hat\rho_n = \min_i \frac{P_n(B(X_i, r_n/2))}{r_n^d}$. 
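A direct, quadratic-time sketch of the plug-in estimator (25) is given below. The function name `rho_hat`, the use of closed balls, and the toy data are assumptions made for illustration; the intrinsic dimension `d` is passed in, in line with the text's assumption that $d$ is known.

```python
import math


def rho_hat(sample, r_n, d):
    """Smallest empirical mass of a ball of radius r_n / 2 centred at a sample point, divided by r_n^d."""
    n = len(sample)
    radius = r_n / 2.0
    smallest_mass = min(
        sum(1 for y in sample if math.dist(x, y) <= radius) / n
        for x in sample
    )
    return smallest_mass / r_n ** d


# Hypothetical one-dimensional toy data embedded in the plane.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
estimate = rho_hat(points, r_n=2.5, d=1)
```

In the toy data the isolated point at 10 has only itself in its ball, so the minimum empirical mass is 1/4 and the estimate is 0.25 / 2.5 = 0.1.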

Our next result shows that, under our assumptions and provided that the sequence $r_n$ vanishes at 
an appropriate rate as $n \to \infty$, $\hat\rho_n$ is a consistent estimator of $\rho$. 

Theorem 8. Let $r_n \asymp (\log n / n)^{\frac{1}{d+2}}$. Then, 

$\hat\rho_n - \rho = O_P(r_n)$. 

Remark We have assumed that d is known. It is also possible to estimate d, although we do not 
pursue that extension here. 

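We now need to use $\hat\rho_n$ to estimate $t_n(\alpha)$ as follows. Assume that $n$ is even and split the data 
randomly into two halves: $\mathcal{S}_n = (\mathcal{S}_{1,n}, \mathcal{S}_{2,n})$. Let $\hat\rho_{1,n}$ be the plug-in estimator of $\rho$ computed from 
$\mathcal{S}_{1,n}$ and define $\hat{t}_{1,n}$ to solve the equation 

(26) $\frac{2^{d+1}}{t^d \hat\rho_{1,n}} \exp\left(-\frac{n t^d \hat\rho_{1,n}}{2}\right) = \alpha$. 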

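Theorem 9. Let $\hat{\mathcal{P}}_2$ be the persistence diagram associated to $\mathcal{S}_{2,n}$ and $\hat{t}_{1,n}$ be the solution to (26). 
Then, 

(27) $\mathbb{P}(W_\infty(\hat{\mathcal{P}}_2, \mathcal{P}) > \hat{t}_{1,n}) \le \mathbb{P}(\mathbf{H}(\mathcal{S}_{2,n}, \mathbb{M}) > \hat{t}_{1,n}) \le \alpha + O\left(\left(\frac{\log n}{n}\right)^{\frac{1}{2+d}}\right)$, 

where the probability $\mathbb{P}$ is with respect to both the joint distribution of the entire sample and the 
randomness induced by the sample splitting. 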

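In practice, we have found that solving (26) for $\hat{t}_n$ without splitting the data also works well although 
we do not have a formal proof. Another way to define $\hat{t}_n$, which is simpler but conservative, is to 
define 

(28) $\hat{t}_n = \left(\frac{2}{n \hat\rho_n} \log\left(\frac{n}{\alpha}\right)\right)^{1/d}$. 

Then $\hat{t}_n = u_n(1 + O(\hat\rho_n - \rho))$ where $u_n = \left(\frac{2}{\rho n} \log\left(\frac{n}{\alpha}\right)\right)^{1/d}$. Then 

$\mathbb{P}(\mathbf{H}(\mathcal{S}_n, \mathbb{M}) > \hat{t}_n) = \mathbb{P}(\mathbf{H}(\mathcal{S}_n, \mathbb{M}) > u_n) + O\left(\left(\frac{\log n}{n}\right)^{\frac{1}{2+d}}\right) \le \alpha + O\left(\left(\frac{\log n}{n}\right)^{\frac{1}{2+d}}\right)$. 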



4.3. Method III: The Method of Shells. The dependence of the previous method on the parameter $\rho$ 
makes the method very fragile. If the density is low in even a small region, then the method above is 
a disaster. Here we develop a sharper bound based on shells of the form $\{x : \gamma_j < \rho(x, \downarrow 0) < \gamma_{j+1}\}$ 
where we recall that 

(29) $\rho(x, \downarrow 0) = \lim_{t \to 0} \rho(x, t) = \lim_{t \to 0} \frac{P(B(x, t/2))}{t^d}$. 

Let $g(v) = G'(v)$, where $G(v) = \mathbb{P}(\rho(X, \downarrow 0) \le v)$. 

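Theorem 10. Suppose that $g$ is bounded and has a uniformly bounded, continuous derivative. 
Then, for any $t \le \rho/(2C_1)$, 

(30) $\mathbb{P}(W_\infty(\hat{\mathcal{P}}, \mathcal{P}) > t) \le \mathbb{P}(\mathbf{H}(\mathcal{S}_n, \mathbb{M}) > t) \le \frac{2^{d+1}}{t^d} \int_{\rho}^{\infty} \frac{g(v)}{v} e^{-n v t^d / 2}\, dv$. 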



Let 

(31) $\hat{g}(v) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{b} K\left(\frac{v - V_i}{b}\right)$, 

where $b > 0$, $V_i = \hat\rho(X_i, r_n)$, and 

$\hat\rho(x, r_n) = \frac{P_n(B(x, r_n/2))}{r_n^d}$. 


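Theorem 11. Let $r_n = \left(\frac{\log n}{n}\right)^{\frac{1}{d+2}}$ and assume the kernel satisfies the usual conditions as in, for 
example, Chapter 1 of Tsybakov (2008). 

(1) We have that 

$\sup_v |\hat{g}(v) - g(v)| = O_P\left(b^2 + \sqrt{\frac{\log n}{n b}} + \frac{r_n}{b^2}\right)$. 

Hence if we choose $b \equiv b_n \asymp r_n^{1/4}$ then 

$\sup_v |\hat{g}(v) - g(v)| = O_P\left(\left(\frac{\log n}{n}\right)^{\frac{1}{2(d+2)}}\right)$. 

(2) Suppose that $n$ is even and that $\rho > 0$. Assume that $g$ is strictly positive over its support $[\rho, B]$. 
Randomly split the data into two halves: $\mathcal{S}_n = (\mathcal{S}_{1,n}, \mathcal{S}_{2,n})$. Let $\hat{g}_{1,n}$ and $\hat\rho_{1,n}$ be estimators of $g$ 
and $\rho$ respectively computed from the first half of the data and define $\hat{t}_{1,n}$ to be the solution of the 
equation 

(32) $\frac{2^{d+1}}{t^d} \int_{\hat\rho_{1,n}}^{\infty} \frac{\hat{g}_{1,n}(v)}{v} e^{-n v t^d / 2}\, dv = \alpha$. 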
Fig 12. We plot the persistence diagram corresponding to the upper level set filtration of the function $f(x)$. For 
consistency, we swap the birth and death axes so all persistence points appear above the line $y = x$. The point born 
first does not die, but for convenience we mark its death at $f(x) = 0$. This is analogous to the point marked with $\infty$ 
in our previous diagrams. 

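Then, 

(33) $\mathbb{P}(W_\infty(\hat{\mathcal{P}}_2, \mathcal{P}) > \hat{t}_{1,n}) \le \mathbb{P}(\mathbf{H}(\mathcal{S}_{2,n}, \mathbb{M}) > \hat{t}_{1,n}) \le \alpha + O(r_n)$, 

where $\hat{\mathcal{P}}_2$ is the persistence diagram associated to $\mathcal{S}_{2,n}$ and the probability $\mathbb{P}$ is with respect to both 
the joint distribution of the entire sample and the randomness induced by the sample splitting. 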

4.4. Method IV: Density Estimation. In this section, we take a completely different approach. We 
use the data to construct a smooth density estimator and then we find the persistence diagram 
defined by a filtration of the upper level sets of the density estimator; see Figure 12. A different 
approach to smoothing based on diffusion distances is discussed in Bendich et al. (2011). 

Again, let $X_1, \dots, X_n$ be a sample from $P$. Define 

(34) $p_h(x) = \int_{\mathbb{M}} \frac{1}{h^D} K\left(\frac{\|x - u\|}{h}\right) dP(u)$. 

Then $p_h$ is the density of the probability measure $P_h$ which is the convolution $P_h = P \star K_h$ where 
$K_h(A) = h^{-D} K(h^{-1} A)$ and $K(A) = \int_A K(t)\,dt$. That is, $P_h$ is a smoothed version of $P$. Our target 
of inference in this section is $\mathcal{P}_h$, the persistence diagram of the upper level sets of $p_h$. The standard 
estimator for $p_h$ is the kernel density estimator 

(35) $\hat{p}_h(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h^D} K\left(\frac{\|x - X_i\|}{h}\right)$. 
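The kernel density estimator (35) can be written out directly. The sketch below assumes a Gaussian kernel (the text does not fix a particular $K$ here), represents points as tuples of coordinates, and uses the hypothetical names `gaussian_kernel` and `kde`.

```python
import math


def gaussian_kernel(u, D):
    """Standard Gaussian density in D dimensions at a point whose norm is u."""
    return math.exp(-0.5 * u * u) / (2.0 * math.pi) ** (D / 2.0)


def kde(x, sample, h):
    """Average of h^(-D) * K(||x - X_i|| / h) over the sample."""
    D = len(x)
    total = sum(gaussian_kernel(math.dist(x, xi) / h, D) for xi in sample)
    return total / (len(sample) * h ** D)


# With a single observation at the origin, D = 1 and h = 1, the estimate at 0
# is the standard normal density at 0.
value = kde((0.0,), [(0.0,)], h=1.0)
```

Evaluating `kde` on a grid and feeding the values to a level-set persistence routine is the usual next step; that part is outside this sketch.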

It is easy to see that $\mathbb{E}(\hat{p}_h(x)) = p_h(x)$. Let us now explain why $\mathcal{P}_h$ is of interest. 

First, the upper level sets of a density are of intrinsic interest in statistics and machine learning. The 
connected components of the upper level sets are often used for clustering. The homology of these 
upper level sets provides further structural information about the density. 



Second, under appropriate conditions, the upper level sets of $p_h$ may carry topological information 
about a set of interest $\mathbb{M}$. To see this, first suppose that $\mathbb{M}$ is a smooth, compact $d$-manifold 
and suppose that $P$ is supported on $\mathbb{M}$. Let $p$ be the density of $P$ with respect to Hausdorff measure 
on $\mathbb{M}$. In the special case where $P$ is the uniform distribution, every upper level set $\{p > t\}$ of $p$ is 
identical to $\mathbb{M}$ for $t > 0$ and small enough. 

Thus, if $\mathcal{P}$ is the persistence diagram defined by the upper level sets $\{x : p(x) > t\}$ of $p$ and 
$\mathcal{Q}$ is the persistence diagram of the distance function $d_{\mathbb{M}}$, then the points of $\mathcal{Q}$ are in one-to-one 
correspondence with the generators of $H(\mathbb{M})$ and, by Equation (11), the points with higher 
persistence in $\mathcal{P}$ are also in one-to-one correspondence with the generators of $H(\mathbb{M})$. For example, suppose 
that $\mathbb{M}$ is a circle in the plane with radius $\tau$. Then $\mathcal{Q}$ has two points: one at $(0, \infty)$ representing a 
single connected component, and one at $(0, \tau)$ representing the single cycle. $\mathcal{P}$ also has two points: 
both at $(0, 1/(2\pi\tau))$ where $1/(2\pi\tau)$ is simply the maximum of the density over the circle. In sum, these 
two persistence diagrams contain the same information; furthermore, $\{x : p(x) > t\} = \mathbb{M}$ for all 
$0 < t < 1/(2\pi\tau)$. 

If $P$ is not uniform but has a smooth density $p$, bounded away from 0, then there is an interval 
$[a, A]$ such that $\{x : p(x) > t\} = \mathbb{M}$ for $a \le t \le A$. Of course, one can create examples where no 
level sets are equal to $\mathbb{M}$ but it seems unlikely that any method can recover the homology of $\mathbb{M}$ in 
those cases. 

Next, suppose there is noise, that is, we observe $Y_1, \dots, Y_n$, where $Y_i = X_i + \sigma \varepsilon_i$ and $\varepsilon_1, \dots, \varepsilon_n \sim \Phi$. 

We assume that $X_1, \dots, X_n \sim Q$ where $Q$ is supported on $\mathbb{M}$. Note that $X_1, \dots, X_n$ are unobserved. 
Here, $\Phi$ is the noise distribution and $\sigma$ is the noise level. The distribution $P$ of $Y_i$ has density 

(36) $p(y) = \int \phi_\sigma(y - u)\, dQ(u)$ 

where $\phi$ is the density of $\varepsilon_i$ and $\phi_\sigma(z) = \sigma^{-D} \phi(z/\sigma)$. In this case, no level set $L_t = \{y : p(y) > t\}$ 
will equal $\mathbb{M}$. But as long as $\phi$ is smooth and $\sigma$ is small, there will be a range of values $a \le t \le A$ 
such that $L_t$ is homotopy equivalent to $\mathbb{M}$. 

The estimator $\hat{p}_h(x)$ is consistent for $p$ if $p$ is continuous, as long as we let the bandwidth $h = h_n$ 
change with $n$ in such a way that $h_n \to 0$ and $n h_n^D \to \infty$ as $n \to \infty$. However, for exploring 
topology, we need not let $h$ tend to 0. Indeed, more precise topological inference is obtained by 
using a bandwidth $h > 0$. Keeping $h$ positive smooths the density but the level sets can still retain 
the correct topological information. 

We would also like to point out that the quantities $\mathcal{P}_h$ and $\hat{\mathcal{P}}_h$ are more robust and much better 
behaved statistically than the Čech complex of the raw data. In the language of computational 
topology, $\mathcal{P}_h$ can be considered a topological simplification of $\mathcal{P}$. $\mathcal{P}_h$ may omit subtle details that 
are present in $\mathcal{P}$ but is much more stable. For these reasons, we now focus on estimating $\mathcal{P}_h$. 

Recall that, from the stability theorem, 

(37) $W_\infty(\hat{\mathcal{P}}_h, \mathcal{P}_h) \le \|\hat{p}_h - p_h\|_\infty$. 

Hence it suffices to find $c_n$ such that 

(38) $\limsup_{n \to \infty} \mathbb{P}(\|\hat{p}_h - p_h\|_\infty > c_n) \le \alpha$. 
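Finite Sample Band. Suppose that the support of $P$ is contained in $\mathcal{X} = [-C, C]^D$. Let $p$ be the 
density of $P$. Let 

$\hat{p}_h(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h^D} K\left(\frac{\|x - X_i\|}{h}\right)$ 

be the kernel density estimator and let 

$p_h(x) = \frac{1}{h^D} \int_{\mathcal{X}} K\left(\frac{\|x - u\|}{h}\right) dP(u)$ 

be the mean of $\hat{p}_h$. 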

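Lemma 12. Assume that $\sup_x K(x) = K(0)$ and that $K$ is $L$-Lipschitz, i.e. $|K(x) - K(y)| \le L\|x - y\|$. Then 

(39) $\mathbb{P}\left(\sup_x |\hat{p}_h(x) - p_h(x)| > \delta\right) \le 2\left(\frac{4CL\sqrt{D}}{\delta h^{D+1}}\right)^{D} \exp\left(-\frac{n \delta^2 h^{2D}}{2K^2(0)}\right)$. 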

Remark: The proof of the above lemma uses Hoeffding's inequality. A sharper result can be 
obtained by using Bernstein's inequality, however, this introduces extra constants that need to be 
estimated. 



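Corollary 13. Let $\delta_n$ solve 

(40) $2\left(\frac{4CL\sqrt{D}}{\delta_n h^{D+1}}\right)^{D} \exp\left(-\frac{n \delta_n^2 h^{2D}}{2K^2(0)}\right) = \alpha$. 

Then 

(41) $\sup_{P \in \mathscr{P}} \mathbb{P}(W_\infty(\hat{\mathcal{P}}_h, \mathcal{P}_h) > \delta_n) \le \sup_{P \in \mathscr{P}} \mathbb{P}\left(\sup_x |\hat{p}_h(x) - p_h(x)| > \delta_n\right) \le \alpha$, 

where $\mathscr{P}$ is the set of all probability measures supported on $\mathcal{X}$. 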



Now we consider a different finite sample band. Computationally, the persistent homology of the 
upper level sets of $\hat{p}_h$ is actually based on a piecewise linear approximation to $\hat{p}_h$. We choose a 
finite grid $G \subset \mathbb{R}^D$ and form a triangulation from the grid. Define $\hat{p}_h^\dagger$ as follows. For $x \in G$, 
let $\hat{p}_h^\dagger(x) = \hat{p}_h(x)$. For $x \notin G$, define $\hat{p}_h^\dagger(x)$ by linear interpolation over the triangulation. Let 
$p_h^\dagger(x) = \mathbb{E}(\hat{p}_h^\dagger(x))$. The real object of interest in persistent homology is the persistence diagram $\mathcal{P}_h^\dagger$ 
of the upper level sets of $p_h^\dagger(x)$. As before, 

$W_\infty(\hat{\mathcal{P}}_h^\dagger, \mathcal{P}_h^\dagger) \le \|\hat{p}_h^\dagger - p_h^\dagger\|_\infty$. 

But due to the piecewise linear nature of these functions, we have that 

$\|\hat{p}_h^\dagger - p_h^\dagger\|_\infty \le \max_{x \in G} |\hat{p}_h(x) - p_h(x)|$. 



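Using a similar proof as above we get: 

Lemma 14. Let $N = |G|$ be the size of the grid. Then 

(42) $\mathbb{P}(\|\hat{p}_h^\dagger - p_h^\dagger\|_\infty > \delta) \le 2N \exp\left(-\frac{2 n \delta^2 h^{2D}}{K^2(0)}\right)$. 

Hence, if 

(43) $\delta_n = \frac{K(0)}{h^D} \sqrt{\frac{1}{2n} \log\left(\frac{2N}{\alpha}\right)}$, 

then 

(44) $\mathbb{P}(\|\hat{p}_h^\dagger - p_h^\dagger\|_\infty > \delta_n) \le \alpha$. 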



This band can be substantially tighter as long as we do not use a grid that is too fine. In a sense, we 
are rewarded for acknowledging that our topological inferences take place at some finite resolution. 
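To see how the grid size enters, the sketch below computes the half-width obtained by inverting a Hoeffding-type bound of the form $2N \exp(-2n\delta^2 h^{2D}/K^2(0))$, which is the form assumed here for the grid band. The function name and all numerical inputs are hypothetical.

```python
import math


def grid_band_halfwidth(n, h, D, grid_size, alpha, kernel_at_zero):
    """Solve 2 * N * exp(-2 * n * delta^2 * h^(2D) / K(0)^2) = alpha for delta."""
    return (kernel_at_zero / h ** D) * math.sqrt(
        math.log(2.0 * grid_size / alpha) / (2.0 * n)
    )


# Hypothetical setting: Gaussian kernel in the plane, so K(0) = 1 / (2 * pi).
k0 = 1.0 / (2.0 * math.pi)
coarse = grid_band_halfwidth(n=1000, h=0.3, D=2, grid_size=50 ** 2,
                             alpha=0.05, kernel_at_zero=k0)
fine = grid_band_halfwidth(n=1000, h=0.3, D=2, grid_size=500 ** 2,
                           alpha=0.05, kernel_at_zero=k0)
```

The half-width grows only like the square root of $\log N$, so refining the grid by a factor of 100 in size widens the band modestly.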



Asymptotic Confidence Band. A tighter (albeit only asymptotic) bound can be obtained using 
large sample theory. The simplest approach is the bootstrap. 

Let $X_1^*, \dots, X_n^*$ be a sample from the empirical distribution $P_n$ and let $\hat{p}_h^*$ denote the density 
estimator constructed from $X_1^*, \dots, X_n^*$. Define the random measure 

(45) $J_n(t) = \mathbb{P}\left(\sqrt{n h^D}\, \|\hat{p}_h^* - \hat{p}_h\|_\infty > t \mid X_1, \dots, X_n\right)$ 

and the bootstrap quantile $Z_\alpha = \inf\{t : J_n(t) \le \alpha\}$. 

Lemma 15. As $n \to \infty$, 

The proof follows from standard results; see for example Theorem 2.6 of Kosorok (2008). As usual, 
we approximate $Z_\alpha$ by Monte Carlo. Let $T = \sqrt{n h^D}\, \|\hat{p}_h^* - \hat{p}_h\|_\infty$ be from a bootstrap sample. Repeat 
bootstrap $B$ times yielding values $T_1, \dots, T_B$. Let 

$Z_\alpha = \inf\left\{z : \frac{1}{B} \sum_{j=1}^{B} I(T_j > z) \le \alpha\right\}$. 

We can ignore the error due to the fact that $B$ is finite since this error can be made as small as we 
like. 
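The Monte Carlo approximation can be sketched for one-dimensional data as follows. Several choices here are assumptions made for illustration: a Gaussian kernel, a supremum taken over a finite evaluation grid rather than the whole line, and the hypothetical names `kde_on_grid` and `bootstrap_quantile`.

```python
import math
import random


def kde_on_grid(sample, grid, h):
    """Gaussian kernel density estimate of 1-D data evaluated at each grid point."""
    norm = len(sample) * h * math.sqrt(2.0 * math.pi)
    return [sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample) / norm
            for x in grid]


def bootstrap_quantile(sample, grid, h, alpha, B=200, seed=0):
    """Monte Carlo Z_alpha: smallest observed z with (1/B) * #{T_j > z} <= alpha."""
    rng = random.Random(seed)
    n = len(sample)
    base = kde_on_grid(sample, grid, h)
    stats = []
    for _ in range(B):
        resample = [rng.choice(sample) for _ in range(n)]
        boot = kde_on_grid(resample, grid, h)
        # sqrt(n * h^D) with D = 1; the sup-norm is approximated on the grid.
        stats.append(math.sqrt(n * h) * max(abs(a - b) for a, b in zip(boot, base)))
    stats.sort()
    for z in stats:
        if sum(t > z for t in stats) / B <= alpha:
            return z


# Hypothetical data and grid, for illustration only.
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
grid = [-3.0 + 0.12 * k for k in range(51)]
z_alpha = bootstrap_quantile(data, grid, h=0.3, alpha=0.05, B=100)
half_width = z_alpha / math.sqrt(len(data) * 0.3)
```

Dividing by $\sqrt{n h^D}$ turns the quantile into a half-width on the density scale, which is the scale on which it is compared with persistence in the diagram.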

We have emphasized fixed $h$ asymptotics since, for topological inference, it is not necessary to let 
$h \to 0$ as $n \to \infty$. However, it is possible to let $h \to 0$ if one wants. Suppose $h \equiv h_n$ and $h \to 0$ as 
$n \to \infty$. We require that $n h^D / \log n \to \infty$ as $n \to \infty$. As before, let $Z_\alpha$ be the bootstrap quantile. 
It follows from Theorem 3.4 of Neumann (1998), that 

4+_D 
Outliers. Now we explain why the density-based method is very insensitive to outliers. Let $P = 
\pi U + (1 - \pi) Q$ where $Q$ is supported on $\mathbb{M}$, $\pi > 0$ is a small positive constant and $U$ is a smooth 
distribution supported on $\mathbb{R}^D$. Apart from a rescaling, the bottleneck distance between $\mathcal{P}_P$ and $\mathcal{P}_Q$ 
is at most $\pi$. The kernel estimator is still a consistent estimator of $p$ and hence, the persistence 
diagram is barely affected by outliers. In fact, in the examples in Bendich et al. (2011), there are 
only a few outliers, which formally corresponds to letting $\pi = \pi_n \to 0$ as $n \to \infty$. In this case, the 
density method is very robust. We show this in more detail in the experiments section. 

5. Experiments. As is common in the literature on computational topology, we focus on a few 
simple, synthetic examples. This will serve to illustrate the methods and allow us to compare them. 

Example 16. Figure 13 shows the methods described in the previous section (except the method of 
shells) applied to a sample from the uniform distribution over the unit circle (n=500). Each method 
provides a 95% confidence band around the diagonal for the persistence diagram. The subsampling 
method and the concentration method both correctly show one significant connected component and one 
significant loop. The finite sample density estimation method does not have sufficient power to detect 
these features. However, the bootstrap density estimation method does find these features. 

Example 17. Figure 14 shows the methods described in the previous section applied to a sample 
from the Normal distribution over the unit circle (n=1000). Each method provides a 95% confidence 
band around the diagonal for the persistence diagram. This is challenging because there is a 
portion of the circle that is very sparsely sampled. The concentration method fails to detect the loop. 
However, the method of shells and the subsampling method both declare the loop to be significant. 

Example 18. Figure 15 shows the methods described in the previous section (except the method of 
shells) applied to a sample from the uniform distribution over the Eyeglasses (n=1000). Each method 
provides a 95 % confidence band around the diagonal for the persistence diagram. All the methods 
find the connected component and one or two loops except the finite sample density estimation 
method. Note that the bootstrap method correctly finds only one loop. 

Example 19. Figure 16 shows the circle example with some outliers. All the methods fail except the 
bootstrap-based density method, which is robust to the presence of outliers. 

Example 20. Figure 17 shows the eyeglass example with outliers. Again, all the methods fail 
except the bootstrap-based density method which is robust to the presence of outliers. 

The last two examples illustrate an important point. Much of the literature on computational 
topology focuses on using the distance function to the data. But as we see here, and as discussed 
in Bendich et al. (2011), such methods are quite fragile. But the density-based methods are very 
insensitive to the presence of outliers. 
Fig 13. Uniform distribution over the unit circle. Top left: sample $\mathcal{S}_n$. Top right: corresponding persistence 
diagram. The black circles indicate the life span of connected components and the red triangles indicate the life span of 
1-dimensional holes. The different 95% confidence bands are computed using Methods I (subsampling) and II 
(concentration of measure). Note that the uniform distribution does not satisfy the assumptions for the Method of Shells. 
Bottom left: kernel density estimator (bandwidth=0.3). Bottom right: density persistence diagram. The different 95% 
confidence bands are computed using the methods described in Section 4.4. 

Fig 14. Truncated Normal distribution over the unit circle. Top left: sample $\mathcal{S}_n$. Top right: corresponding persistence 
diagram. The black circles indicate the life span of connected components and the red triangles indicate the life 
span of 1-dimensional holes. The different 95% confidence bands are computed using Methods I (subsampling), II 
(concentration of measure) and III (method of shells). Bottom left: kernel density estimator (bandwidth=0.3). Bottom 
right: density persistence diagram. The different 95% confidence bands are computed using the methods described in 
Section 4.4. 

Fig 15. Uniform distribution over the Eyeglasses curve. Top left: sample $\mathcal{S}_n$. Top right: corresponding persistence 
diagram. The black circles indicate the life span of connected components and the red triangles indicate the life 
span of 1-dimensional holes. The different 95% confidence bands are computed using Methods I (subsampling) and 
II (concentration of measure). Note that the uniform distribution does not satisfy the assumptions for the Method 
of Shells. Bottom left: kernel density estimator (bandwidth=0.3). Bottom right: density persistence diagram. The 
different 95% confidence bands are computed using the methods described in Section 4.4. 

Fig 16. Uniform distribution over the unit circle with some outliers. Top left: sample $\mathcal{S}_n$. Top right: corresponding 
persistence diagram. The black circles indicate the life span of connected components and the red triangles indicate 
the life span of 1-dimensional holes. The 95% confidence bands are computed using Methods I (subsampling) and II 
(concentration of measure). Bottom left: kernel density estimator (bandwidth=0.3). Bottom right: density persistence 
diagram. The different 95% confidence bands are computed using the methods described in Section 4.4. All the methods 
fail except the bootstrap-based density method. 

Fig 17. Uniform distribution over the Eyeglasses curve with some outliers. Top left: sample $\mathcal{S}_n$. Top right: corresponding 
persistence diagram. The black circles indicate the life span of connected components and the red triangles indicate 
the life span of 1-dimensional holes. The 95% confidence bands are computed using Methods I (subsampling) and II 
(concentration of measure). Bottom left: kernel density estimator (bandwidth=0.3). Bottom right: density persistence 
diagram. The different 95% confidence bands are computed using the methods described in Section 4.4. All the methods 
fail except the bootstrap-based density method. 
6. The Persistence Diagram as a Point Process. The persistence diagram can be regarded 
as a point process $\xi_{\mathcal{S}_n}(b, d)$ in the plane. (This was suggested to us by Herbert Edelsbrunner.) 

A persistence diagram is defined by a set of points $(b_1, d_1), \dots, (b_m, d_m)$ corresponding to birth 
times and death times of topological features, where $m = m(\mathcal{S}_n)$ is the number of points in the 
persistence diagram $\hat{\mathcal{P}}$. For $A \subset \mathbb{R}^2$ define 

$\xi_{\mathcal{S}_n}(A) = \sum_{j=1}^{m} I_A(\{(b_j, d_j)\})$ 

which simply counts the number of points in $A$. Following Edelsbrunner, we smooth the process. 
The simplest way to do this is to divide the plane into squares $\mathcal{A} = \{A_1, A_2, \dots\}$ each of size 
$w > 0$. The smoothed process is then $\Xi_{\mathcal{S}_n} = \{\xi_{\mathcal{S}_n}(A) : A \in \mathcal{A}\}$. We endow the set of such processes 
with the $\ell_\infty$ metric. 
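The smoothed process is just a table of counts per square, which the following sketch makes concrete. The half-open squares $[iw, (i+1)w) \times [jw, (j+1)w)$, the function names, and the toy diagrams are assumptions made for illustration.

```python
import math
from collections import Counter


def smoothed_counts(diagram, w):
    """Count diagram points per square; key (i, j) is [i*w, (i+1)*w) x [j*w, (j+1)*w)."""
    return Counter((math.floor(b / w), math.floor(d / w)) for b, d in diagram)


def linf_distance(counts_a, counts_b):
    """Largest absolute difference in counts over all squares."""
    squares = set(counts_a) | set(counts_b)
    return max((abs(counts_a[s] - counts_b[s]) for s in squares), default=0)


# Hypothetical diagrams, for illustration only.
first = smoothed_counts([(0.1, 0.9), (0.15, 0.95), (0.6, 0.7)], w=0.5)
second = smoothed_counts([(0.1, 0.9)], w=0.5)
gap = linf_distance(first, second)
```

Repeating `smoothed_counts` over bootstrap or subsample diagrams and summarizing the counts per square gives one way to visualize the variability of the diagram.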

If $\Xi_{\mathcal{S}_n}$ is the point process corresponding to the Čech diagram then $\Xi_{\mathcal{S}_n}$ does not converge in 
distribution, at least for $H_0$. However, if $\Xi_{\mathcal{S}_n}$ is the point process corresponding to the density 
persistence diagram $\hat{\mathcal{P}}_h$ then 

for some Gaussian process $\mathbb{G}$. This follows since $\mathbb{G}_n = \sqrt{n h^D}\,(\hat{p}_h - p_h)$ converges to a Gaussian 
process and the smoothed point process is a continuous function of $\mathbb{G}_n$. 

Looking at the variability of this smoothed process over subsampling or bootstrapping gives another 
way to visualize the uncertainty. 

Example 21. The top left plot in Figure 18 shows the Bart Simpson density 

(47) $p(x) = \frac{1}{2}\phi(x; 0, 1) + \frac{1}{10} \sum_{j=0}^{4} \phi(x; (j/2) - 1, 1/10)$ 

where $\phi(x; \mu, \sigma)$ denotes a Normal density with mean $\mu$ and standard deviation $\sigma$. The top right plot 
shows the kernel density estimator based on 1,000 draws from $p$, using bandwidth 0.05. The bottom 
left plot is the Density Persistence Diagram obtained through the kernel density estimator and the 
highlighted region shows the points of the plane at distance more than 0.34 from the diagonal. A 
bootstrapped 95% confidence interval for the number of points of the Persistence Diagram falling in 
this region is [3,5]. Finally, the bottom right plot shows the smoothed bootstrapped version of the 
Persistence Diagram. 
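For readers who want to reproduce the setting of this example, the sketch below evaluates and samples from the mixture, reading (47) as one half of a standard Normal plus five narrow Normal components of weight 1/10 each, centered at $(j/2) - 1$ for $j = 0, \dots, 4$ with standard deviation 1/10. The function names and the seed are hypothetical.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a Normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def bart_simpson_density(x):
    """Half standard Normal plus five narrow components of weight 1/10 each."""
    spikes = sum(normal_pdf(x, j / 2.0 - 1.0, 0.1) for j in range(5))
    return 0.5 * normal_pdf(x, 0.0, 1.0) + 0.1 * spikes


def sample_bart_simpson(n, seed=0):
    """Draw n values by first picking a mixture component, then sampling from it."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        if rng.random() < 0.5:
            draws.append(rng.gauss(0.0, 1.0))
        else:
            j = rng.randrange(5)
            draws.append(rng.gauss(j / 2.0 - 1.0, 0.1))
    return draws


draws = sample_bart_simpson(1000)
```

A kernel density estimate of `draws` with a small bandwidth should show the five narrow modes on top of the broad Normal component.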



7. Proofs. Proof of Theorem 6. First, we claim that there exists an event of probability close to 
1 such that, over this event, $\mathbf{H}(\mathcal{S}_n, \mathbb{M}) \le \mathbf{H}(\mathcal{S}_b, \mathcal{S}_n)$ for any subsample $\mathcal{S}_b$ of size $b$. To see this, let 
$t_n = \left(\frac{4 \log n}{\rho n}\right)^{1/d}$ and define the event $\mathcal{A}_n = \{\mathbf{H}(\mathcal{S}_n, \mathbb{M}) < t_n\}$. By the remark following Lemma 7, 

IP(-^n) — n log n > ^ or n l ar § e enough. Let V = {di, . . . , e^v} be a packing set of balls of radius 3t n 
(for n large enough, 3t n will be smaller than half the diameter of M, so this set is non-empty and, 
in fact, of cardinality increasing in re). Now, c log n/n < N' < ^d~\^i from the proof of Lemma 
7. Suppose A n holds and, arguing by contradiction, also that H(Sb,S n ) < H(5 n ,M). Then 



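(48) $\mathbf{H}(\mathcal{S}_b, \mathcal{D}) \le \mathbf{H}(\mathcal{S}_b, \mathcal{S}_n) + \mathbf{H}(\mathcal{S}_n, \mathcal{D}) \le \mathbf{H}(\mathcal{S}_n, \mathbb{M}) + \mathbf{H}(\mathcal{S}_n, \mathcal{D}) \le 2\mathbf{H}(\mathcal{S}_n, \mathbb{M}) \le 2t_n$. 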
Fig 18. Point Process approach. Top left: density defined in (47). Top right: kernel density estimator 
(bandwidth=0.05). Bottom left: corresponding density persistence diagram. Bottom right: smooth density persistence 
diagram from bootstrapping. 
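Next, notice that $N' - b = N'\left(1 - \frac{b}{N'}\right)$. Thus, if $\frac{b}{N'} = o(1)$, then a $(1 - o(1))$ 
fraction of the balls $B(d_j, 3t_n)$ contain no points from $\mathcal{S}_b$. So $\mathbf{H}(\mathcal{S}_b, \mathcal{D}) > 3t_n$, which, in light of 
(48), yields a contradiction. Thus, on $\mathcal{A}_n$, $\mathbf{H}(\mathcal{S}_n, \mathbb{M}) \le \mathbf{H}(\mathcal{S}_b, \mathcal{S}_n)$. 

Define 

$\tilde{L}_b(t) = \frac{1}{N} \sum_{j=1}^{N} I(\mathbf{H}(\mathcal{S}_b^j, \mathbb{M}) > t)$ 

and 

$J_b(t) = \mathbb{P}(\mathbf{H}(\mathcal{S}_b, \mathbb{M}) > t)$. 

Then $\tilde{L}_b(t) - J_b(t)$ is a centered U-process, and from the proof of Theorem 3.3 of Arcones and Giné 
(1994), we have that 

$\sup_t |J_b(t) - \tilde{L}_b(t)| = O\left(\sqrt{\frac{b \log n}{n}}\right)$ a.s. 

On the event $\mathcal{A}_n$, 

$\mathbf{H}(\mathcal{S}_b, \mathbb{M}) \le \mathbf{H}(\mathcal{S}_b, \mathcal{S}_n) + \mathbf{H}(\mathcal{S}_n, \mathbb{M}) \le 2\mathbf{H}(\mathcal{S}_b, \mathcal{S}_n)$ 

so that $\mathbf{H}(\mathcal{S}_b, \mathcal{S}_n) \ge \mathbf{H}(\mathcal{S}_b, \mathbb{M})/2$ which implies that 

$\tilde{L}_b(t) \le L_b(t/2)$. 

Hence, almost surely for all large $n$, 

$\mathbb{P}(\mathbf{H}(\mathcal{S}_n, \mathbb{M}) > c_b) \le \mathbb{P}(\mathbf{H}(\mathcal{S}_b, \mathbb{M}) > c_b) = J_b(c_b)$ 
$\le \tilde{L}_b(c_b) + O\left(\sqrt{\frac{b \log n}{n}}\right)$ 
$\le L_b(c_b/2) + O\left(\sqrt{\frac{b \log n}{n}}\right) + \mathbb{P}(\mathcal{A}_n^c)$ 
$\le \alpha + O\left(\sqrt{\frac{b \log n}{n}}\right) + \frac{2^d}{n \log n}$. □ 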



Proof of Lemma 7. Let $\mathcal{C} = \{c_1, \dots, c_N\}$ be a minimal $t/2$-covering set for $\mathbb{M}$, for $t/2 < \mathrm{diam}(\mathbb{M})$. 
Let $B_j = B(c_j, t/2)$. Thus, $\mathbf{H}(\mathcal{C}, \mathbb{M}) \le t/2$ and 

$\mathbb{P}(\mathbf{H}(\mathcal{S}_n, \mathbb{M}) > t) \le \mathbb{P}(\mathbf{H}(\mathcal{S}_n, \mathcal{C}) + \mathbf{H}(\mathcal{C}, \mathbb{M}) > t) \le \mathbb{P}(\mathbf{H}(\mathcal{S}_n, \mathcal{C}) > t/2)$ 
$= \mathbb{P}(B_j \cap \mathcal{S}_n = \emptyset \text{ for some } j) \le \sum_j \mathbb{P}(B_j \cap \mathcal{S}_n = \emptyset)$ 
$= \sum_j [1 - P(B_j)]^n \le N[1 - \rho(t) t^d]^n \le N \exp(-n \rho(t) t^d)$, 

where the second-to-last inequality follows from the fact that $\min_j P(B_j) \ge \rho(t) t^d$ by definition of 
$\rho(t)$ and the last inequality from the fact that $\rho(t) t^d \le 1$. Next, let $\mathcal{D} = \{d_1, \dots, d_{N'}\}$ be a maximal 
$t/2$-packing set for $\mathbb{M}$ and set $B'_j = B(d_j, t/4)$, for $j = 1, \dots, N'$. Then, $N \le N'$ and the balls 
$\{B'_j, j = 1, \dots, N'\}$ are disjoint. Therefore 

$1 = P(\mathbb{M}) \ge \sum_{j=1}^{N'} P(B'_j) \ge N' \rho(t/2) \frac{t^d}{2^d}$, 

where we have used again the fact that $\min_j P(B'_j) \ge \rho(t/2) \frac{t^d}{2^d}$. We conclude that $N \le N' \le 
2^d / (\rho(t/2) t^d)$. Hence, 

$\mathbb{P}(\mathbf{H}(\mathcal{S}_n, \mathbb{M}) > t) \le \frac{2^d}{\rho(t/2) t^d} \exp(-n \rho(t) t^d)$. 

Now suppose that $t \le \min\{\rho/(2C_2), t_0\}$ ($C_2$ and $t_0$ are defined in Assumption A2). Then, since we 
assume that $\rho(t)$ is differentiable on $[0, t_0]$ with derivative bounded in absolute value by $C_2$, there 
exists $0 \le \tilde{t} \le t$ such that 

$\rho(t) = \rho + t\rho'(\tilde{t}) \ge \rho - C_2 t \ge \frac{\rho}{2}$. 

Similarly, under the same conditions on $t$, we have that $\rho(t/2) \ge \frac{\rho}{2}$. The result follows. □ 



Proof of Theorem 8. Let 

$\hat\rho(x, t) = \frac{P_n(B(x, t/2))}{t^d}$. 

Note that 

$\sup_{x \in \mathbb{M}} P(B(x, r_n/2)) \le C r_n^d$ 

for some $C > 0$, by Assumption A2. Let $r_n = \left(\frac{\log n}{n}\right)^{\frac{1}{d+2}}$ and consider all $n$ large enough so that 

p d 2 2 

(49) ^(logra) d + 2 <l and red+2 ~ ^ + 2 lo S n > 

hold. Let $\mathcal{E}_{1,n}$ be the event that the sample $\mathcal{S}_n$ forms an $r_n$-cover for $\mathbb{M}$. Then, by Lemma 7, 



d f d 



2 d+1 ( n \ d + 2 { p /logn\ d + 2 



1 n) < i ex P S - o n 1 

p Viogn / 1 2 V n 

2^^ d ( (p d \ 2 "I 

< nd+2 exp |— (logn)d+2 J j 

2 d+1 _d_ r 2 

< n d + 2 exp < — n d + 2 ^ 

2 d+1 f _2_ d 

< exp < — n d + 2 + logn 

p [ d + 2 

2 d+i 1 

< , 

p n 

where the third and fourth inequalities hold since n is assumed large enough to satisfy (49). 
Let $C_3 = \max\{C_1, C_2\}$ where $C_1$ and $C_2$ are defined in Assumption A2. Let 



2 log n 
l)ri 



(n 



where C = max{4C3,2(C + 1/3)}. Assume further that n is large enough so that, in addition to 
(49), e n < 1. With these choice of e n , define the event 



<?2,n = { max 
i=l, ...,n 



P ^(PpQ, r n /2)) P{B{X h r n /2)) 



<ce n 



where c* satisfies 



n' n 



1 



n 



n' n 



for all n large enough (which exists by our choice of e n and r n ). We will show that P(<?2,n) > 1 — 
To this end, let P l denotes the conditional probability of S n \ {Xi} given X{ and Pj the marginal 
probability of Xi, for i = 1, . . . , n. Then, by the union bound, 



P n (B(Xi,r n /2)) P(B(Xi,r n /2)) 



> c e r 



(50) 



P(£ 2 C ,J<^I 
i=i 

= J2 F (\Pi,n-i(B(Xi,r n /2)) - P(B(Xi,r n /2))\ > c*e n r d n - ^\ 

i=l ^ 
n 

= J> {\Pi,n-i{B{Xi,r n /2)) - P(B{X i; r n /2))\ > e 

i=l 
n 

= ^E, [F (|^ n _ 1 (,B(A J ,r n /2)) - P(B(X<,r n /2))| > e n r^ 



i=i 



where P^n-i is the empirical measure corresponding to the data points S n \{Xi} and in first identity 
we have used the fact that P n (B(Xi,r n /2)) = Pj jn _i(P(Aj, r n /2)) + ~, for all i. 

By Bernstein's inequality, for any i = 1, . . . , n, 

P i , n _ 1 (B(X i ,r n /2))-P(P(X i ,r n /2))| > e B r*) < 2exp - 



< 2exp 
2 



l (n-l)e^ 
2 (C + l/3) 



< 



where in the first inequality we have used the fact that P(B(x,r n /2) (1 — P(B(x,r n /2))) < Cr d . 
Therefore, from (50), 

nsu < -• 



/z 



Let $j = \arg\min_i \hat\rho(X_i, r_n)$ and $k = \arg\min_i \rho(X_i, r_n)$. Suppose $\mathcal{E}_{2,n}$ holds and, arguing by 
contradiction, that 

(51) $|\hat\rho(X_j, r_n) - \rho(X_k, r_n)| = |\hat\rho_n - \rho(X_k, r_n)| > c^* \epsilon_n$. 

Since $\mathcal{E}_{2,n}$ holds we have $|\hat\rho(X_j, r_n) - \rho(X_j, r_n)| \le c^* \epsilon_n$ and $|\hat\rho(X_k, r_n) - \rho(X_k, r_n)| \le c^* \epsilon_n$. This 
implies that if $\hat\rho(X_j, r_n) < \rho(X_k, r_n)$ then $\rho(X_j, r_n) < \rho(X_k, r_n)$, while if $\hat\rho(X_j, r_n) > \rho(X_k, r_n)$ then 
$\hat\rho(X_k, r_n) < \hat\rho(X_j, r_n)$, which is a contradiction. 

Therefore, with probability P ((£i,n D <?2,n) c ) > 1 — — — ^ +2 , the sample points 5 n forms an r n - 
covering of M and 

$|\hat\rho_n - \min_i \rho(X_i, r_n)| \le c^* \epsilon_n$. 

Since the sample $\mathcal{S}_n$ is a covering of $\mathbb{M}$, 

(52) $\left|\min_i \rho(X_i, r_n) - \inf_{x \in \mathbb{M}} \rho(x, r_n)\right| \le \max_i \sup_{x \in B(X_i, r_n)} |\rho(x, r_n) - \rho(X_i, r_n)|$. 

Because $\rho(x, t)$ has a bounded continuous derivative in $t$ uniformly over $x$, we have, if $r_n < t_0$, 

$\sup_{x \in B(X_i, r_n)} |\rho(x, r_n) - \rho(X_i, r_n)| \le C_3 r_n$, 

almost surely. Furthermore, since $\rho(t)$ is right-differentiable at 0, 

$|\rho(r_n) - \rho| \le C_3 r_n$, 

for all $r_n < t_0$. Combining the last two observations with (52) and using the triangle inequality, we 
conclude that 

$|\hat\rho_n - \rho| \le c^* \epsilon_n + 2C_3 r_n$, 

with probability at least 1 — — — , for all n large enough. Because our choice of r n satisfies the 



equation r n = y^pr, the terms on the right hand side of the last displayed are balanced, so that 

$|\hat\rho_n - \rho| \le C_4 \left(\frac{\log n}{n}\right)^{\frac{1}{d+2}}$ 

for some $C_4 > 0$. □ 



Proof of Theorem 9. Let $\mathbb{P}_1$ denote the unconditional probability measure induced by the first 
random half $\mathcal{S}_{1,n}$ of the sample, $\mathbb{P}_2$ the conditional probability measure induced by the second half 
of the sample $\mathcal{S}_{2,n}$ given the outcome of the data splitting and the values of the first sample, 
and $\mathbb{P}_{1,2}$ the probability measure induced by the whole sample and the random splitting. Then, 

$\mathbb{P}(\mathbf{H}(\mathcal{S}_{2,n}, \mathbb{M}) > \hat{t}_{1,n}) = \mathbb{P}_{1,2}(\mathbf{H}(\mathcal{S}_{2,n}, \mathbb{M}) > \hat{t}_{1,n}) = \mathbb{E}_1\left(\mathbb{P}_2(\mathbf{H}(\mathcal{S}_{2,n}, \mathbb{M}) > \hat{t}_{1,n})\right)$ 

where Ei denotes the expectation corresponding to Pi. By Theorem 8, there exists constant C and 

1 

C such that the event A n that \p\, n — p\ < C {^p^j d+2 has Pi-probability no smaller than 1 — 
for n large enough. Then, 

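(53) $\mathbb{E}_1\left(\mathbb{P}_2(\mathbf{H}(\mathcal{S}_{2,n}, \mathbb{M}) > \hat{t}_{1,n})\right) \le \mathbb{E}_1\left(\mathbb{P}_2(\mathbf{H}(\mathcal{S}_{2,n}, \mathbb{M}) > \hat{t}_{1,n}); \mathcal{A}_n\right) + \mathbb{P}_1(\mathcal{A}_n^c)$, 

where, for a random variable $X$ with expectation $\mathbb{E}_X$ and an event $\mathcal{E}$ measurable with respect to 
$X$, we write $\mathbb{E}[X; \mathcal{E}]$ for the expectation of $X$ restricted to the sample points in $\mathcal{E}$. 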
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Define F(t, p) 



^r^-e P;2 * d - Then, by Lemma 7, 



P 2 (H(5 2 , n ,M) > t 1>n ) < F(t 1<n ,p). 

The rest of the proof is devoted to showing that, on A n , F(ti jn ,p) < a + O ((logn/2) 1 /^" 1 " 2 )). 
To simplify the derivation, we will write 0(R n ) to indicate a term that is in absolute value of 
order O ((log n/2) 1 <( d+2 n , where the exact value of the constant may change from line to line. 
Accordingly, on the event A n and for n large enough, pi >n — p = 0(R n ). Since p > by assumption, 
this implies that, on the same event and for n large, 



Pi 



P 



1 + 0{R n ) and ^- = 1 + 0(R n ) 

P Pl,n 



Now, on A n and for all n large enough, 

2 d+1 

F(t ljTl ,p) = ^ exp 



ti,nP ' V 2 
Pl,n\ 2 d +! 



P J Pi, 



exp 



nt inPl,n[^ 



(l + 0(R n ))F(t ltn ,p 1)n )exp 



ntf n p ltn O(R n )) 



a(l + 0(R n )) 



exp 



nh,n Pl,n 



2 

O(Rn) 



(54) 



o> 



= a(l + 0(i? n )) 
where the last two identities follow from the fact that 
(55) F(ti,n, Pi,n) = ol, 

for all n. 

i 

Next, let t* n = d ■ We then claim that, for all n large enough and on the event A n , t\ )n < t* n . 

In fact, using similar arguments, 

F(t* n , pi, n ) = F(t* n , p)(l + 0(R n )) exp (-nt* n O(R n )) -)■ 0, as n oo, 

since 0(R n ) = o(l) and, by Lemma 7, F(t* n ,p) — > and n — >• oo. 

By (55), it then follows that, for all n large enough, F(ti jn ,p\ >n ) > F(t n ,p\ )n ). Because F(t,p) is 
decreasing in t for each p, the claim is proved. 
Thus, substituting $\hat t_{1,n}$ with $t_n^*$ in equation (54) yields

$$\begin{aligned}
F(\hat t_{1,n},\rho) \le F(t_n^*,\rho) &\le \alpha(1 + O(R_n))\left[\frac{\alpha}{2^{d+1}}(t_n^*)^d\,\hat\rho_{1,n}\right]^{O(R_n)} \\
&= \alpha(1 + O(R_n))\left[\frac{\alpha}{2^{d+1}}(t_n^*)^d\,\rho\,(1 + O(R_n))\right]^{O(R_n)} \\
&= \alpha(1 + O(R_n))\left[\frac{\log n}{n} + o(1)\right]^{O(R_n)} \\
&= \alpha(1 + O(R_n))(1 + o(1)) = \alpha + O(R_n),
\end{aligned}$$

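as $n \to \infty$, where we have written as $o(1)$ all terms that are lower order than $O(R_n)$. The
second-to-last step follows from the limit

$$\lim_{n\to\infty}\left(\frac{\log n}{n}\right)^{C\left(\frac{\log n}{n}\right)^{\frac{1}{d+2}}}
= \lim_{n\to\infty}\exp\left\{C\left(\frac{\log n}{n}\right)^{\frac{1}{d+2}}\log\frac{\log n}{n}\right\}
= \exp\left\{\lim_{n\to\infty} C\left(\frac{\log n}{n}\right)^{\frac{1}{d+2}}\log\frac{\log n}{n}\right\} = 1,$$

for some constant $C$.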
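
The convergence in this limit is slow, which a quick numerical check makes visible. The sketch below is illustrative only: the function name `bracket_power`, the choice $C = 1$, the dimension $d = 2$ and the sample sizes are assumptions, not values from the paper.

```python
import math


def bracket_power(n, d, C=1.0):
    """Evaluate (log n / n) ** (C * (log n / n) ** (1 / (d + 2)))."""
    ratio = math.log(n) / n
    # Written as exp{C * ratio^(1/(d+2)) * log(ratio)}, as in the displayed limit.
    return math.exp(C * ratio ** (1.0 / (d + 2)) * math.log(ratio))


# Hypothetical sample sizes; the values increase towards 1 as n grows.
values = [bracket_power(n, d=2) for n in (1e3, 1e12, 1e30)]
```

The exponent $C(\log n/n)^{1/(d+2)}\log(\log n/n)$ tends to zero because a power of $\log n/n$ dominates its logarithm, but for moderate $n$ the expression is still far from 1.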

Therefore, on the event $A_n$ and for all $n$ large enough,

$$\mathbb{P}_2(\mathbf{H}(S_{2,n},M) > \hat t_{1,n}) \le \alpha + O(R_n).$$

Since $\mathbb{P}_1(A_n^c) = O(R_n)$ (in fact, it is of lower order than $O(R_n)$), the result follows from (53). □



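Proof of Theorem 10. Let $B = \sup_{x\in M}\rho(x,\downarrow 0)$. Fix some $\delta > 0$. Choose equally spaced values
$\rho = \gamma_1 < \gamma_2 < \cdots < \gamma_m < \gamma_{m+1} = B$ such that $\delta = \gamma_{j+1} - \gamma_j$. Let $\Omega_j = \{x : \gamma_j \le \rho(x,\downarrow 0) < \gamma_{j+1}\}$
and $h(S_n,\Omega_j) = \sup_{y\in\Omega_j}\min_{x\in S_n}\|x - y\|$ for $j = 1,\dots,m$. Now $\mathbf{H}(S_n,M) = h(S_n,M) \le \max_j h(S_n,\Omega_j)$
and so

$$\mathbb{P}(\mathbf{H}(S_n,M) > t) \le \mathbb{P}\Bigl(\max_j h(S_n,\Omega_j) > t\Bigr) \le \sum_{j=1}^m \mathbb{P}(h(S_n,\Omega_j) > t).$$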

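Let $C_j = \{c_{j1},\dots,c_{jN_j}\}$ be a $t/2$ covering set for $\Omega_j$, for $j = 1,\dots,m$. Let $B_{jk} = B(c_{jk},t/2)$, for
$k = 1,\dots,N_j$. Then, for all $j$ and $t < \min\{\gamma_j/(2C_1), t_0\}$ ($C_1$ and $t_0$ are defined in Assumption A2),
there is $0 \le \tilde t \le t$ such that

$$\frac{P(B_{jk})}{t^d} = \rho(c_{jk},t) \ge \rho(c_{jk},\downarrow 0) + t\,\rho'(c_{jk},\tilde t) \ge \rho(c_{jk},\downarrow 0) - C_1 t \ge \frac{\gamma_j}{2}$$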
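and

$$\begin{aligned}
\mathbb{P}(h(S_n,\Omega_j) > t) &\le \mathbb{P}(h(S_n,C_j) + h(C_j,\Omega_j) > t) \le \mathbb{P}(h(S_n,C_j) > t/2) \\
&\le \mathbb{P}(B_{jk}\cap S_n = \emptyset \text{ for some } k) \le \sum_{k=1}^{N_j}\mathbb{P}(B_{jk}\cap S_n = \emptyset) \\
&= \sum_{k=1}^{N_j}\bigl[1 - P(B_{jk})\bigr]^n \le \sum_{k=1}^{N_j}\left[1 - t^d\frac{\gamma_j}{2}\right]^n = N_j\left[1 - t^d\frac{\gamma_j}{2}\right]^n \\
&\le N_j\exp\left(-\frac{n\gamma_j t^d}{2}\right).
\end{aligned}$$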



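Following the strategy in the proof of Lemma 7, we have $P(\Omega_j) \ge N_j t^d\gamma_j/2^{d+1}$ so that $N_j \le
2^{d+1}P(\Omega_j)/(\gamma_j t^d)$. Therefore, for $t < \min\{\rho/(2C_1), t_0\}$,

$$\mathbb{P}(\mathbf{H}(S_n,M) > t) \le \frac{2^{d+1}}{t^d}\sum_{j=1}^m \frac{P(\Omega_j)}{\gamma_j}\exp\left(-\frac{n\gamma_j t^d}{2}\right) \to \frac{2^{d+1}}{t^d}\int_\rho^\infty \frac{g(v)}{v}\exp\left(-\frac{n v t^d}{2}\right)dv$$

as $\delta \to 0$.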



□ 
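
The integral bound just derived is never larger than the Lemma 7 bound $F(t,\rho)$, since $g$ integrates to one and $v \ge \rho$ on the range of integration. The sketch below illustrates this numerically with a midpoint Riemann sum, mirroring the slicing into the sets $\Omega_j$. It is an illustration under stated assumptions: the density $g$ is taken to be uniform on $[\rho, B]$ purely for convenience, and the function names and parameter values are hypothetical.

```python
import math


def lemma7_bound(t, rho, n, d):
    """F(t, rho) = 2^(d+1) / (t^d * rho) * exp(-rho * n * t^d / 2)."""
    return 2 ** (d + 1) / (t ** d * rho) * math.exp(-rho * n * t ** d / 2)


def refined_bound(t, rho, B, n, d, m=10_000):
    """Midpoint Riemann sum for (2^(d+1)/t^d) * int_rho^B g(v)/v * exp(-n v t^d / 2) dv.

    Assumption for illustration: g is the uniform density on [rho, B].
    """
    delta = (B - rho) / m          # slice width, playing the role of delta in the proof
    g = 1.0 / (B - rho)            # uniform density value
    total = 0.0
    for j in range(m):
        v = rho + (j + 0.5) * delta
        total += g / v * math.exp(-n * v * t ** d / 2) * delta
    return 2 ** (d + 1) / t ** d * total


# Hypothetical values: t = 0.2, rho = 1, B = 3, n = 500, d = 2.
coarse = lemma7_bound(0.2, rho=1.0, n=500, d=2)
fine = refined_bound(0.2, rho=1.0, B=3.0, n=500, d=2)
```

With these inputs `fine` is strictly smaller than `coarse`; the gap depends on how much of the mass of $g$ sits above $\rho$.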



Proof of Theorem 11. Let

$$g^*(v) = \frac{1}{n}\sum_{i=1}^n \frac{1}{b}K\left(\frac{v - W_i}{b}\right), \tag{56}$$

where $W_i = \rho(X_i,\downarrow 0)$. Then

$$\sup_v|\hat g(v) - g(v)| \le \sup_v|g(v) - g^*(v)| + \sup_v|g^*(v) - \hat g(v)|.$$

By standard asymptotics for kernel estimators,

$$\sup_v|g(v) - g^*(v)| \le C b^2 + O_P\left(\sqrt{\frac{\log n}{nb}}\right).$$



Next, note that

$$\left|K\left(\frac{v - W_i}{b}\right) - K\left(\frac{v - V_i}{b}\right)\right| \le \frac{C|W_i - V_i|}{b} \preceq \frac{r_n}{b}$$

from Theorem 8. Hence,

$$|g^*(v) - \hat g(v)| \preceq \frac{r_n}{b^2}.$$



Statement (1) follows. 
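
The perturbation step can be checked numerically. In the sketch below, the triangular kernel is an assumption chosen because it is Lipschitz with constant 1, the values standing in for $W_i$ are simulated, and the $V_i$ are obtained by perturbing each $W_i$ by at most `r`. None of these choices comes from the paper.

```python
import random


def tri_kernel(u):
    """Triangular kernel max(0, 1 - |u|); Lipschitz with constant 1."""
    return max(0.0, 1.0 - abs(u))


def kde(v, sample, b):
    """Kernel estimate (1/n) * sum_i (1/b) * K((v - x_i) / b)."""
    return sum(tri_kernel((v - x) / b) for x in sample) / (len(sample) * b)


rng = random.Random(0)
n, b, r = 200, 0.3, 0.01
W = [rng.uniform(1.0, 2.0) for _ in range(n)]   # simulated stand-ins for W_i
V = [w + rng.uniform(-r, r) for w in W]         # perturbed values, |W_i - V_i| <= r
grid = [0.5 + 0.01 * k for k in range(201)]     # evaluation points v in [0.5, 2.5]
sup_diff = max(abs(kde(v, W, b) - kde(v, V, b)) for v in grid)
```

For this kernel the argument gives `sup_diff <= r / b**2`: each kernel term moves by at most `r / b`, and the extra factor `1 / b` comes from the normalization of the estimator.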

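To show the second statement, we will use the same notation and a similar strategy as in the proof
of Theorem 9. Thus,

$$\mathbb{P}(\mathbf{H}(S_{2,n},M) > \hat t_{1,n}) = \mathbb{P}_{1,2}(\mathbf{H}(S_{2,n},M) > \hat t_{1,n}) = \mathbb{E}_1\bigl(\mathbb{P}_2(\mathbf{H}(S_{2,n},M) > \hat t_{1,n})\bigr).$$

Let $A_n$ be the event defined in the proof of Theorem 9 and $B_n$ the event that $\sup_v|\hat g(v) - g(v)| \le r_n$.
Then, $\mathbb{P}_1(A_n \cap B_n) \ge 1 - O(1/n)$. Now,

$$\mathbb{E}_1\bigl(\mathbb{P}_2(\mathbf{H}(S_{2,n},M) > \hat t_{1,n})\bigr) \le \mathbb{E}_1\bigl(\mathbb{P}_2(\mathbf{H}(S_{2,n},M) > \hat t_{1,n});\, A_n \cap B_n\bigr) + \mathbb{P}_1\bigl((A_n \cap B_n)^c\bigr).$$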

By Theorem 10, conditionally on $S_{1,n}$ and the randomness of the data splitting,

$$\mathbb{P}_2(\mathbf{H}(S_{2,n},M) > \hat t_{1,n}) \le \frac{2^{d+1}}{\hat t_{1,n}^{\,d}}\int_\rho^\infty \frac{g(v)}{v}\,e^{-n v\,\hat t_{1,n}^{\,d}/2}\,dv. \tag{57}$$

We will show next that, on the event $A_n \cap B_n$, the right-hand side of the previous equation is
bounded by $\alpha + O(r_n)$ as $n \to \infty$. The second claim of the theorem will then follow. Throughout
the proof, we will assume that the event $A_n \cap B_n$ holds.

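Recall that $\hat t_{1,n}$ solves the equation

$$\frac{2^{d+1}}{\hat t_{1,n}^{\,d}}\int_{\hat\rho_n}^\infty \frac{\hat g(v)}{v}\,e^{-n v\,\hat t_{1,n}^{\,d}/2}\,dv = \alpha \tag{58}$$

(and this solution exists for all large $n$). By assumption, $g(v)$ is uniformly bounded away from 0.
Hence, $g(v)/\hat g(v) = 1 + O(r_n)$ and so,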



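$$\begin{aligned}
\frac{2^{d+1}}{\hat t_{1,n}^{\,d}}\int_\rho^\infty \frac{g(v)}{v}\,e^{-n v\,\hat t_{1,n}^{\,d}/2}\,dv
&= \frac{2^{d+1}}{\hat t_{1,n}^{\,d}}\int_{\hat\rho_n}^B \frac{g(v)}{v}\,e^{-n v\,\hat t_{1,n}^{\,d}/2}\,dv + z_n \\
&= (1 + O(r_n))\,\frac{2^{d+1}}{\hat t_{1,n}^{\,d}}\int_{\hat\rho_n}^B \frac{\hat g(v)}{v}\,e^{-n v\,\hat t_{1,n}^{\,d}/2}\,dv + z_n \\
&= \alpha(1 + O(r_n)) + O(\hat\rho_n - \rho) + z_n = \alpha + O(r_n) + z_n,
\end{aligned}$$

where

$$z_n = \frac{2^{d+1}}{\hat t_{1,n}^{\,d}}\int_\rho^{\hat\rho_n} \frac{g(v)}{v}\,e^{-n v\,\hat t_{1,n}^{\,d}/2}\,dv.$$

Now $\hat t_{1,n} \ge c_1(\log n/n)^{1/d} \equiv u_n$ for some $c_1 > 0$, for all large $n$, since otherwise the left-hand side of
(58) would go to zero and (58) would not be satisfied. So, for some positive $c_2$, $c_3$,

$$|z_n| \le \cdots \le |\hat\rho_n - \rho|\,(\cdots) = O(|\hat\rho_n - \rho|).$$

□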
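Proof of Lemma 12. First we show that

$$|p_h(x) - p_h(y)| \le \frac{L}{h^{D+1}}\|x - y\|.$$

This follows since

$$\begin{aligned}
|p_h(x) - p_h(y)| &\le \frac{1}{h^D}\int\left|K\left(\frac{\|x - u\|}{h}\right) - K\left(\frac{\|y - u\|}{h}\right)\right|dP(u) \\
&\le \frac{L}{h^D}\int\left|\frac{\|x - u\|}{h} - \frac{\|y - u\|}{h}\right|dP(u) \le \frac{L\,\|x - y\|}{h^{D+1}}.
\end{aligned}$$

By a similar argument,

$$|\hat p_h(x) - \hat p_h(y)| \le \frac{L}{h^{D+1}}\|x - y\|.$$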



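Divide $\mathcal{X}$ into a grid based on cubes $A_1,\dots,A_N$ with sides of length $\varepsilon$. Note that $N \asymp (C/\varepsilon)^D$. Each
cube $A_j$ has diameter $\sqrt{D}\,\varepsilon$. Let $c_j$ be the center of $A_j$. Now

$$\sup_x|\hat p_h(x) - p_h(x)| = \max_j\sup_{x\in A_j}|\hat p_h(x) - p_h(x)| \le \max_j|\hat p_h(c_j) - p_h(c_j)| + 2v,$$

where $v = \frac{L\sqrt{D}\,\varepsilon}{h^{D+1}}$. We have that

$$\begin{aligned}
\mathbb{P}\Bigl(\sup_x|\hat p_h(x) - p_h(x)| > \delta\Bigr) &\le \mathbb{P}\Bigl(\max_j|\hat p_h(c_j) - p_h(c_j)| > \delta - 2v\Bigr) \\
&\le \sum_j \mathbb{P}\bigl(|\hat p_h(c_j) - p_h(c_j)| > \delta - 2v\bigr).
\end{aligned}$$

Let $\varepsilon = \frac{\delta h^{D+1}}{4L\sqrt{D}}$. Then $2v = \delta/2$ and so

$$\mathbb{P}\Bigl(\sup_x|\hat p_h(x) - p_h(x)| > \delta\Bigr) \le \sum_j \mathbb{P}\left(|\hat p_h(c_j) - p_h(c_j)| > \frac{\delta}{2}\right).$$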



Note that $\hat p_h(x)$ is a sum of $n$ independent quantities, each bounded between $0$ and $K(0)/(nh^D)$. Hence, by
Hoeffding's inequality,

$$\mathbb{P}\left(|\hat p_h(c_j) - p_h(c_j)| > \frac{\delta}{2}\right) \le 2\exp\left(-\frac{n\delta^2 h^{2D}}{2K^2(0)}\right).$$



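Therefore,

$$\begin{aligned}
\mathbb{P}\Bigl(\sup_x|\hat p_h(x) - p_h(x)| > \delta\Bigr) &\le 2N\exp\left(-\frac{n\delta^2 h^{2D}}{2K^2(0)}\right) \\
&= 2\left(\frac{C}{\varepsilon}\right)^D\exp\left(-\frac{n\delta^2 h^{2D}}{2K^2(0)}\right) \\
&= 2\left(\frac{4CL\sqrt{D}}{\delta h^{D+1}}\right)^D\exp\left(-\frac{n\delta^2 h^{2D}}{2K^2(0)}\right).
\end{aligned}$$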



□ 
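
To make the last chain of identities concrete, the following sketch evaluates the bound in two ways: through the grid construction (side length $\varepsilon$, cube count $N = (C/\varepsilon)^D$, and the per-center Hoeffding term) and through the closed form. The function names and all numeric inputs are hypothetical, and $N$ is treated as exactly $(C/\varepsilon)^D$ for the purpose of the check.

```python
import math


def bound_via_grid(delta, n, h, D, L, C, K0):
    """Union bound over grid centers: N * 2 * exp(-n delta^2 h^(2D) / (2 K0^2))."""
    eps = delta * h ** (D + 1) / (4 * L * math.sqrt(D))  # grid side length
    N = (C / eps) ** D                                   # number of cubes (taken as exact)
    per_center = 2 * math.exp(-n * delta ** 2 * h ** (2 * D) / (2 * K0 ** 2))
    return N * per_center


def bound_closed_form(delta, n, h, D, L, C, K0):
    """2 * (4 C L sqrt(D) / (delta h^(D+1)))^D * exp(-n delta^2 h^(2D) / (2 K0^2))."""
    prefactor = 2 * (4 * C * L * math.sqrt(D) / (delta * h ** (D + 1))) ** D
    return prefactor * math.exp(-n * delta ** 2 * h ** (2 * D) / (2 * K0 ** 2))


# Hypothetical inputs: delta = 0.1, h = 0.5, D = 2, L = C = K(0) = 1.
params = dict(delta=0.1, h=0.5, D=2, L=1.0, C=1.0, K0=1.0)
small_n = bound_via_grid(n=10_000, **params)
large_n = bound_via_grid(n=100_000, **params)
closed = bound_closed_form(n=100_000, **params)
```

The polynomial prefactor grows as $\delta$ or $h$ shrinks, so the bound is informative only once $n\delta^2 h^{2D}$ is large enough for the exponential term to dominate.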



8. Conclusion. We have presented several methods for separating noise from signal in persistent 
homology. The first three methods are based on the distance function to the data. The last uses 
density estimation. There is a useful analogy here: methods that use the distance function to the 
data are like statistical methods that use the empirical distribution function. Methods that use 
density estimation use smoothing. The advantage of the former is that it is more directly connected 
to the raw data. The advantage of the latter is that it is less fragile, that is, it is more robust to 
noise and outliers. 

We conclude by mentioning some open questions that we plan to address in future work: 






We focused on assessing the uncertainty of persistence diagrams. Similar ideas can be applied
to assess uncertainty in barcode plots. This requires assessing uncertainty at different scales $\varepsilon$.
This suggests examining the variability of $\mathbf{H}(\hat S_\varepsilon, S_\varepsilon)$ at different values of $\varepsilon$. From Molchanov
(1998), we have that

$$\sqrt{n\varepsilon^D}\,\mathbf{H}(\hat S_\varepsilon, S_\varepsilon) \rightsquigarrow \inf_{x\in\partial S_\varepsilon}\left|\frac{G(x)}{L(x)}\right|, \tag{59}$$

where $G$ is a Gaussian process,

$$L(x) = \frac{d}{dt}\inf\bigl\{p_\varepsilon(y) - p_\varepsilon(x) : \|x - y\| \le t\bigr\}\Big|_{t=0}, \tag{60}$$

and $p_\varepsilon(x)$ is the mean of the kernel estimator using a spherical kernel with bandwidth $\varepsilon$.
The limiting distribution is not helpful for inference because it would be very difficult to
estimate $L(x)$. We are investigating practical methods for constructing confidence intervals
on $\mathbf{H}(\hat S_\varepsilon, S_\varepsilon)$ and using this to assess uncertainty of the barcodes.

Confidence intervals provide protection against type I errors, i.e., false detections. It is also
important to investigate the power of the methods to detect real topological features. Similarly,
we would like to quantify the minimax bounds for persistent homology. 
In the density estimation method, we used a fixed bandwidth. Spatially adaptive bandwidths 
might be useful for more refined inferences. 

It is also of interest to construct confidence intervals for other topological parameters such as
the degree $p$ total persistence defined by

$$\theta = 2\sum d(x, \mathrm{Diag})^p,$$

where the sum is over the points in $\mathcal{P}$ whose distance from the diagonal is greater than some
threshold, and Diag denotes the diagonal.

Our experiments are meant to be a proof of concept. Detailed simulation studies are needed 
to see under which conditions the various methods work well. 

The subsampling method is very conservative due to the fact that $b = o(n)$. Essentially, there
is a bias of order $\mathbf{H}(S_b, M) - \mathbf{H}(S_n, M)$. We conjecture that it is possible to adjust for this
bias.

The optimal bandwidth for the density estimation method is an open question. The usual
theory is based on $L_2$ loss which is not necessarily appropriate for topological estimation.
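
The degree $p$ total persistence mentioned among the open questions above is simple to compute from a diagram. The sketch below is illustrative only: it assumes the diagram is a list of (birth, death) pairs and that $d(x, \mathrm{Diag})$ is the sup-norm distance to the diagonal, $(\text{death} - \text{birth})/2$; the function name `total_persistence` and the example diagram are hypothetical.

```python
def total_persistence(diagram, p, threshold=0.0):
    """Return 2 * sum of d(x, Diag)**p over points farther than `threshold` from the diagonal.

    Assumption: d(x, Diag) is the sup-norm distance (death - birth) / 2.
    """
    total = 0.0
    for birth, death in diagram:
        dist = (death - birth) / 2.0
        if dist > threshold:       # keep only points beyond the threshold
            total += dist ** p
    return 2.0 * total


# Hypothetical diagram with two long-lived features and one short-lived one.
diagram = [(0.0, 1.0), (0.2, 0.3), (0.5, 2.5)]
theta_1 = total_persistence(diagram, p=1, threshold=0.1)
theta_2 = total_persistence(diagram, p=2)
```

With $p = 1$ and the threshold at 0.1, the short-lived point is dropped and the result is the summed lifetime of the two remaining features.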